State estimation of electrical power networks using semidefinite relaxation

ABSTRACT

A semidefinite programming (SDP) formulation for state estimation (SE) of nonlinear AC power systems is described. The techniques make use of convex semidefinite relaxation (SDR) of the original problem to render it efficiently solvable. In addition, robust techniques are described that are resilient to outlying measurements and/or adversarial cyber-attacks. Further, techniques for SDR-based SE are described in which local control areas solve the centralized SDP-based SE problem in a distributed fashion.

This application claims the benefit of U.S. Provisional Application No. 61/623,943, filed Apr. 13, 2012, the entire content of which is incorporated herein by reference.

TECHNICAL FIELD

This disclosure relates to electrical power systems, and more specifically to monitoring electrical power systems.

BACKGROUND

The electric power grid is a complex system consisting of multiple subsystems, each with a transmission infrastructure spanning over a huge geographical area, transporting energy from generation sites to distribution networks. Monitoring the operational conditions of grid transmission networks is of paramount importance to facilitate system control and optimization tasks, including security analysis and economic dispatch under security constraints.

SUMMARY

In general, this disclosure is directed to monitoring techniques that can estimate system state of electrical power busses within large-scale power grids. In some examples, as further explained below, multi-area state estimation may be performed in a distributed fashion for multiple interconnected “subgrids.” In addition, robust techniques are described that are resilient to outlying measurements and/or adversarial cyber-attacks.

In one embodiment, a method for estimating a state for each of a plurality of alternating current (AC) electrical power buses within a power grid includes receiving, with an energy management system for the power grid, measurements of an electrical characteristic from a plurality of measurement units (MUs) positioned at a subset of the power buses within the power grid. The method further includes processing the measurements of the electrical characteristic with the energy management system to compute an estimate of the electrical characteristic at each of the power buses within the power grid, the estimates of the electrical characteristic being nonlinearly related to the measurements of the electrical characteristic, wherein processing the measurements of the electrical characteristic comprises applying semidefinite relaxation to compute a semidefinite programming model for the power buses within the power grid that linearly relates the estimates for the electrical characteristic to the measurements of the electrical characteristic, and iteratively processing the measurements of the electrical characteristic in accordance with the semidefinite programming model to compute a convex minimization as a solution for the estimates of the electrical characteristic.

In another embodiment, a device includes memory to store measurements of an electrical characteristic from a plurality of measurement units (MUs) positioned at a subset of alternating current (AC) electrical power buses within a power grid, and one or more processors configured to execute program code to process the measurements of the electrical characteristic to compute an estimate of the electrical characteristic at each of the power buses within the power grid, the estimates of the electrical characteristic being nonlinearly related to the measurements of the electrical characteristic, wherein the program code is configured to apply semidefinite relaxation to compute a semidefinite programming model for the power buses within the power grid that linearly relates the estimates for the electrical characteristic to the measurements of the electrical characteristic, and iteratively process the measurements of the electrical characteristic in accordance with the semidefinite programming model to compute a convex minimization as a solution for the estimates of the electrical characteristic.

In this disclosure, the following notation is used: Upper (lower) boldface letters will be used for matrices (column vectors); (•)^T denotes transposition; (•)^H complex-conjugate transposition; Re(•) the real part; Im(•) the imaginary part; Tr(•) the matrix trace; rank(•) the matrix rank; 0 the all-zero matrix; ‖•‖_p the vector p-norm for p≥1; and |•| the magnitude of a complex number.

The details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 illustrates an example system that implements techniques described herein to estimate system state of electrical power buses within large-scale power grids.

FIG. 2 illustrates an IEEE 30-bus system commonly used for power system testing.

FIGS. 3A1, 3A2, 3B1 and 3B2 compare estimation errors in voltage magnitudes and angles between semidefinite relaxation (SDR) and weighted least-squares (WLS) solvers at different buses.

FIGS. 4A1, 4A2, 4B1 and 4B2 are graphs that compare estimation errors in voltage magnitudes and angles between SDR and WLS solvers.

FIGS. 5A and 5B are graphs that compare estimation errors between SDR and WLS solvers versus the number of PMUs for angle estimates and magnitude estimates.

FIG. 6 is a block diagram that shows another example network of multiple interconnected electrical power systems.

FIGS. 7A and 7B are graphs that compare estimation errors in voltage magnitudes and angles between SDR and WLS solvers at different buses for robust semidefinite estimation.

FIG. 8 illustrates another example system that implements techniques described herein to provide local estimates for interconnected power grids in a distributed fashion.

FIG. 9A is a graph that plots local matrix error versus an iteration index for distributed semidefinite estimation.

FIG. 9B is a graph that plots local estimation error versus an iteration index for distributed semidefinite estimation.

FIG. 10 is a flowchart illustrating example operation of a control device, such as an energy management system of FIG. 1 or an energy management system for any of the control areas of FIG. 8.

FIG. 11 is a block diagram of an example computing device operable to execute one or more of the state estimation techniques described herein.

DETAILED DESCRIPTION

FIG. 1 illustrates an example system 10 that implements techniques described herein to estimate system state of electrical power buses within large-scale power grids, such as power grid 12. In the example of FIG. 1, power grid 12 includes a plurality of N electrical buses 14 interconnected by transmission lines 15. Power flow meters 16 and power injection meters 17 are example measurement units that provide measurements of various system variables at a selected subset of transmission lines 15 within power grid 12. Power flow meters 16 and power injection meters 17 transmit the measurements to energy management system 12 for estimation of system state variables, such as complex bus voltages, at all electrical buses 14 throughout power grid 12. For example, power flow meters 16 measure how much power is flowing on a given transmission line 15. Power injection meters 17 measure how much power generation/load is currently occurring at each corresponding electrical bus 14. In some examples, as further explained below, multi-area state estimation may be performed in a distributed fashion for multiple interconnected “subgrids.” In addition, robust techniques are described that are resilient to outlying measurements and/or adversarial cyber-attacks.

The present disclosure leverages physical properties of AC power systems in order to develop polynomial-time SE algorithms, which offer the potential to find a globally optimal state estimate. Challenged by the nonconvexity in SE, the approaches herein may make use of a technique called semidefinite relaxation (SDR) to relax an otherwise nonconvex problem to a semidefinite programming (SDP) one. As described herein, SDR may provide a versatile optimization technique for solving nonconvex problems associated with state estimation of the bus voltages at each of electrical buses 14.

Energy management system 12 may comprise one or more controllers or computers programmed to take certain action in response to computing estimates for the electrical characteristic for all of buses 14 within power grid 12. For example, energy management system 12 may control operation of one or more of electrical buses 14 based on the computed estimates. As another example, energy management system 12 may generate reports that specify power consumption at each of buses 14 within power grid 12 based on the computed estimates. Energy management system 12 may output communications to automatically update an accounting system to charge respective operators of each of power buses 14 based on the computed estimates. Energy management system 12 may generate alerts based on the computed estimates.

As used herein, the state estimation (SE) task for power systems refers to the process of acquiring estimates of system variables, such as the voltage phasors at all buses 14, in the power grid 12. This is inherently a nonconvex problem giving rise to many local optima due to the nonlinear coupling present among legacy meter measurements provided by some of power flow meters 16. Specifically, SE falls under the class of nonlinear (weighted) least-squares (LS) problems, for which the Gauss-Newton method is the “workhorse” solver. This iterative scheme has also been regarded as the algorithmic foundation of SE. Using the Taylor expansion around an initial point, the Gauss-Newton method approximates the nonlinear LS cost by a linear one, whose minimum is used for the ensuing iteration. Since this iterative procedure is in fact related to gradient descent solvers for nonconvex problems, it inevitably faces challenges pertaining to sensitivity to the initial guess and convergence issues. Without guaranteed convergence to the global optimum, existing variants have asserted numerical stability, but are restrained to improving the linearized LS cost per iteration. Linear state measurements offered by synchronized phasor measurement units (PMUs) can be incorporated as some of power flow meters 16. However, limited PMU deployment may, in some implementations, confine SE to mostly rely on the nonlinear legacy meter measurements, and its companion Gauss-Newton iterative methods.

A power transmission network (power grid 12) typically includes a plurality of AC power subsystems 14 (referred to herein as electrical buses) that are interconnected to distribute electrical power. A power grid of such subsystems is typically managed by one or more regional transmission organizations (RTOs) or independent system operators (ISOs). Consider power grid 12 of FIG. 1 with N electrical buses denoted by the set of nodes $\mathcal{N}:=\{1,\ldots,N\}$, and transmission lines 15 represented by the set of edges $\mathcal{E}:=\{(n,m)\}\subseteq\mathcal{N}\times\mathcal{N}$. To estimate the complex voltage V_(n) at each bus $n\in\mathcal{N}$, measurements are taken for a subset of the following system variables:

-   P_(n) (Q_(n)): the real (reactive) power injection at bus n (negative if bus n is connected to a load);
-   P_(mn) (Q_(mn)): the real (reactive) power flow from bus m to bus n; and
-   |V_(n)|: the voltage magnitude at bus n.

Compliant with the well-known AC power flow model, these measurements are nonlinearly related with the power system state of interest, namely the voltage vector $v:=[V_1,\ldots,V_N]^{T}\in\mathbb{C}^{N}$. To specify this relationship, similarly collect the injected currents of all buses in $i:=[I_1,\ldots,I_N]^{T}\in\mathbb{C}^{N}$, and let $Y\in\mathbb{C}^{N\times N}$ represent the grid's symmetric bus admittance matrix. Kirchhoff's law in vector-matrix form dictates i=Yv, where the (m,n)-th entry of Y is given by

$\begin{matrix} {Y_{mn}:=\left\{ \begin{matrix} {-y_{mn},} & {\text{if } (m,n)\in\mathcal{E}} \\ {y_{nn}+\sum\limits_{\nu\in\mathcal{N}_{n}}y_{n\nu},} & {\text{if } m=n} \\ {0,} & {\text{otherwise}} \end{matrix} \right.} & (1) \end{matrix}$ with y_(mn) denoting the line admittance between buses m and n, y_(nn) bus n's admittance to the ground, and $\mathcal{N}_{n}$ the set of all buses linked to bus n through transmission lines.
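The construction in (1) can be illustrated with a minimal NumPy sketch. The function name `build_admittance_matrix`, the `(m, n, y_mn)` line format, the 0-based bus indices, and the numerical values are all assumptions made for illustration; they do not come from the disclosure.

```python
import numpy as np

def build_admittance_matrix(num_buses, lines, ground_admittances=None):
    """Assemble the bus admittance matrix Y following (1).

    lines: iterable of (m, n, y_mn) with 0-based bus indices (assumed format).
    ground_admittances: optional dict {n: y_nn}; missing buses default to 0.
    """
    Y = np.zeros((num_buses, num_buses), dtype=complex)
    for m, n, y_mn in lines:
        Y[m, n] -= y_mn          # off-diagonal entries: -y_mn
        Y[n, m] -= y_mn          # Y is symmetric
        Y[m, m] += y_mn          # diagonal accumulates admittances of incident lines
        Y[n, n] += y_mn
    for n, y_nn in (ground_admittances or {}).items():
        Y[n, n] += y_nn          # bus-to-ground admittance y_nn
    return Y

# Hypothetical 3-bus loop; the impedances are illustrative values only.
lines = [(0, 1, 1 / (0.01 + 0.05j)),
         (1, 2, 1 / (0.02 + 0.08j)),
         (0, 2, 1 / (0.015 + 0.06j))]
Y = build_admittance_matrix(3, lines, {0: 0.02j})
v = np.array([1.00 + 0.00j, 0.98 - 0.03j, 0.97 - 0.05j])
i = Y @ v                        # injected currents, i = Yv
```

Each row of Y sums to that bus's ground admittance, which gives a quick sanity check on any assembled matrix.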

Letting $\bar{y}_{mn}$ stand for the shunt admittance at bus m associated with the line (m,n), the current flowing from bus m to n is $I_{mn}=\bar{y}_{mn}V_{m}+y_{mn}(V_{m}-V_{n})$. In addition to currents, the AC power flow model further asserts that the apparent power injection into bus n is given by $P_{n}+jQ_{n}=V_{n}I_{n}^{H}$, while the apparent power flow from bus m to bus n by $P_{mn}+jQ_{mn}=V_{m}I_{mn}^{H}$. Finally, expressing the squared bus voltage magnitude as $|V_{n}|^{2}=V_{n}V_{n}^{H}$, it is clear that all measurable quantities listed earlier are nonlinearly (in fact quadratically) related to the state v.

Collect these (possibly noisy) measurements in the L×1 vector $z:=[\{\check{P}_{n}\}_{n\in\mathcal{N}_{P}},\{\check{Q}_{n}\}_{n\in\mathcal{N}_{Q}},\{\check{P}_{mn}\}_{(m,n)\in\mathcal{E}_{P}},\{\check{Q}_{mn}\}_{(m,n)\in\mathcal{E}_{Q}},\{|\check{V}_{n}|^{2}\}_{n\in\mathcal{N}_{V}}]^{T}$, where the check mark differentiates measured values from the error-free variables. For consistency with other measurements, the squared magnitude |V_(n)|² is considered from now on. This is possible by converting the model $|\check{V}_{n}|=|V_{n}|+\varepsilon_{V}$, where ε_(V) is zero-mean Gaussian with small variance σ_(V)², to an approximate model for the squared magnitude; namely, $|\check{V}_{n}|^{2}\approx|V_{n}|^{2}+\varepsilon'_{V}$, where ε′_(V) has variance $4|\check{V}_{n}|^{2}\sigma_{V}^{2}$. The l-th entry of z can be written as z_(l)=h_(l)(v)+ε_(l), where h_(l)(•) denotes the nonlinear relationship specified according to the aforementioned AC power flow equations. The zero-mean Gaussian error ε_(l) at the l-th meter is assumed independent across meters with variance σ_(l)². Due to the independence among errors, the maximum-likelihood (ML) criterion for estimating v boils down to the weighted least-squares (WLS) one, yielding the optimal state estimator as

$\begin{matrix} {{\hat{v}:={\arg\;{\min\limits_{v}{\sum\limits_{l = 1}^{L}{w_{l}\left\lbrack {z_{l} - {h_{l}(v)}} \right\rbrack}^{2}}}}},} & (2) \end{matrix}$ where w_(l):=1/σ_(l) ²∀l. The nonlinear WLS SE formulation in (2) is clearly nonconvex.
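As a rough illustration of the cost in (2), the sketch below evaluates it for measurements of the quadratic form h_(l)(v)=v^H H_(l) v, which the disclosure later establishes in (6). The helper name `wls_cost`, the 3-bus voltages, and the choice of measuring only squared voltage magnitudes are assumptions made for the example.

```python
import numpy as np

def wls_cost(v, z, H_list, sigmas):
    """WLS cost of (2) with h_l(v) = v^H H_l v and weights w_l = 1 / sigma_l^2."""
    h = np.array([np.vdot(v, H @ v).real for H in H_list])   # np.vdot conjugates v
    w = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    return float(np.sum(w * (np.asarray(z) - h) ** 2))

# Hypothetical 3-bus example that measures only squared voltage magnitudes.
N = 3
H_list = [np.outer(np.eye(N)[n], np.eye(N)[n]) for n in range(N)]
v_true = np.array([1.00 + 0.00j, 0.98 - 0.03j, 0.97 - 0.05j])
z = np.abs(v_true) ** 2          # noise-free measurements
sigmas = np.full(N, 0.01)

cost_at_truth = wls_cost(v_true, z, H_list, sigmas)
cost_flat = wls_cost(np.ones(N, dtype=complex), z, H_list, sigmas)   # flat voltage profile
```

With noise-free data the cost vanishes at the true state and is strictly positive at the flat profile, which is the usual starting point for iterative solvers.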

The Gauss-Newton iterative solver for nonlinear WLS problems has been widely used for SE. Using Taylor's expansion around a given starting point, the pure form of Gauss-Newton methods approximates the cost in (2) with a linear WLS one, and relies on its minimizer to initialize the subsequent iteration. This iterative procedure is closely related to gradient descent algorithms for solving nonconvex problems, which are known to encounter two issues: i) sensitivity to the initial guess; and ii) convergence concerns.

Typical WLS-based SE iterations start with a flat voltage profile, where all bus voltages are initialized with the same real number. Unfortunately, this may not guarantee convergence to the global optimum. Existing variants have asserted improved numerical stability, but they are all limited to improving the approximate WLS cost per iteration. Recently, due to rapid developments in PMU technology, SE has benefited greatly by including the synchrophasor data, which adhere to linear measurement models with respect to (wrt) the unknown v. Nonetheless, challenges emerging from the nonlinearity of legacy measurements must be addressed in the resultant PMU-aided SE methods.

In summary, one challenge so far has been to develop a solver attaining or approximating the global optimum at polynomial-time complexity. The next section addresses this challenge by appropriately reformulating SE to apply the semidefinite relaxation (SDR) technique.

Consider first expressing each quadratic measurement in z linearly in terms of the outer-product matrix $V:=vv^{H}$. To this end, let $\{e_{n}\}_{n=1}^{N}$ denote the canonical basis of $\mathbb{R}^{N}$, and define the following admittance-related matrices

$\begin{matrix} {Y_{n}:=e_{n}e_{n}^{T}Y} & (3a) \\ {Y_{mn}:=(\bar{y}_{mn}+y_{mn})e_{m}e_{m}^{T}-y_{mn}e_{m}e_{n}^{T}} & (3b) \end{matrix}$ and their related Hermitian counterparts

$\begin{matrix} {{H_{P,n}:={\frac{1}{2}\left( {Y_{n} + Y_{n}^{H}} \right)}},{H_{Q,n}:={\frac{j}{2}\left( {Y_{n} - Y_{n}^{H}} \right)}}} & \left( {4a} \right) \\ {{H_{P,{m\; n}}:={\frac{1}{2}\left( {Y_{m\; n} + Y_{m\; n}^{H}} \right)}},{H_{Q,{m\; n}}:={\frac{j}{2}\left( {Y_{m\; n} - Y_{m\; n}^{H}} \right)}}} & \left( {4b} \right) \\ {H_{V,n}:={e_{n}{e_{n}^{T}.}}} & \left( {4c} \right) \end{matrix}$ Using these definitions, the following lemma can be proved to establish a linear model in the complex V.

Lemma 1:

All error-free measurement variables are linearly related with the outer-product V as

$\begin{matrix} {P_{n}=\mathrm{Tr}(H_{P,n}V),\quad Q_{n}=\mathrm{Tr}(H_{Q,n}V)} & (5a) \\ {P_{mn}=\mathrm{Tr}(H_{P,mn}V),\quad Q_{mn}=\mathrm{Tr}(H_{Q,mn}V)} & (5b) \\ {|V_{n}|^{2}=\mathrm{Tr}(H_{V,n}V).} & (5c) \end{matrix}$ Thus, the noisy meter measurement z_(l) can be written as

$\begin{matrix} {z_{l}=h_{l}(v)+\varepsilon_{l}=\mathrm{Tr}(H_{l}V)+\varepsilon_{l}} & (6) \end{matrix}$ where H_(l) is a Hermitian matrix specified in accordance with (4a)-(4c).

Proof: To establish (5a), use the injected power flow equation, and successively Kirchhoff's law and (3a) to obtain $P_{n}+jQ_{n}=V_{n}I_{n}^{H}=(V_{n}^{H}I_{n})^{H}=(v^{H}e_{n}e_{n}^{T}i)^{H}=(v^{H}Y_{n}v)^{H}=v^{H}Y_{n}^{H}v=\mathrm{Tr}(Y_{n}^{H}V)$. Hence, P_(n) and Q_(n) are related to V using the real and imaginary parts of Y_(n)^(H), respectively, as asserted in (5a). Likewise, (5b) follows after replacing with the line power flow equation and (4b), while |V_(n)|² is naturally the (n, n)-th entry of V as in (5c).
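Lemma 1 can be checked numerically. The sketch below compares the quadratic expressions from the AC power flow model with the linear forms Tr(H V) from (5a)-(5c). The random symmetric matrix standing in for Y, the random state, and the line admittance values are hypothetical test data, and the helper `lin` is introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
# Hypothetical data: a random symmetric "admittance" matrix and a random state.
B = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
Y = B + B.T                               # symmetric, as assumed for Y
v = rng.normal(size=N) + 1j * rng.normal(size=N)
V = np.outer(v, v.conj())                 # V = v v^H
i = Y @ v                                 # Kirchhoff's law, i = Yv
e = np.eye(N)

# Injection at bus n.
n = 2
Y_n = np.outer(e[n], e[n]) @ Y            # (3a)
H_P = 0.5 * (Y_n + Y_n.conj().T)          # (4a)
H_Q = 0.5j * (Y_n - Y_n.conj().T)         # (4a)
H_V = np.outer(e[n], e[n])                # (4c)
S_n = v[n] * np.conj(i[n])                # P_n + jQ_n

# Flow on line (m, k) with hypothetical series and shunt admittances.
m, k = 0, 1
y_mk, y_sh = 2.0 - 6.0j, 0.05j
I_mk = y_sh * v[m] + y_mk * (v[m] - v[k])
Y_mk = (y_sh + y_mk) * np.outer(e[m], e[m]) - y_mk * np.outer(e[m], e[k])   # (3b)
H_Pf = 0.5 * (Y_mk + Y_mk.conj().T)       # (4b)
H_Qf = 0.5j * (Y_mk - Y_mk.conj().T)      # (4b)
S_mk = v[m] * np.conj(I_mk)               # P_mk + jQ_mk

def lin(H):
    """Evaluate Tr(H V), which is real for Hermitian H and V."""
    return np.trace(H @ V).real
```

Comparing `lin(H_P)` with `S_n.real`, `lin(H_Q)` with `S_n.imag`, and so on, confirms that each measurement is linear in V even though it is quadratic in v.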

Lemma 1 yields the following equivalent reformulation of (2)

$\begin{matrix} {\hat{V}_{1}:=\arg\min\limits_{V\in\mathbb{C}^{N\times N}}\sum\limits_{l=1}^{L}w_{l}\left[z_{l}-\mathrm{Tr}(H_{l}V)\right]^{2}} & (7a) \\ {\text{s. to } V\succcurlyeq 0,\ \text{and } \mathrm{rank}(V)=1} & (7b) \end{matrix}$ where the positive semi-definiteness and rank constraints jointly ensure that for any V admissible to (7b), there always exists a $v\in\mathbb{C}^{N}$ such that $V=vv^{H}$.

Despite the linearity between z_(l) and V in the new formulation (7), nonconvexity is still present in two aspects: i) the cost in (7a) has degree 4 wrt the entries of V; and ii) the rank constraint in (7b) is nonconvex. Aiming for an SDP formulation of (7), Schur's complement lemma can be leveraged to convert the summands in (7a) to a linear cost over an auxiliary vector $\chi\in\mathbb{R}^{L}$. Specifically, with $w:=[w_{1},\ldots,w_{L}]^{T}$ and likewise for χ, consider the second SE reformulation as:

$\begin{matrix} {\left\{ {{\hat{V}}_{2},{\hat{\chi}}_{2}} \right\}:={{\arg\;{\min\limits_{V,\chi}{\sum\limits_{l = 1}^{L}{w_{l}\chi_{l}}}}} = {\arg\;{\min\limits_{V,\chi}{w^{T}\chi}}}}} & \left( {8a} \right) \\ {{{{s.{to}}\mspace{14mu} V} \succcurlyeq 0},{{{and}\mspace{14mu}{{rank}(V)}} = 1}} & \left( {8b} \right) \\ {\begin{bmatrix} {- \chi_{l}} & {z_{l} - {{Tr}\left( {H_{l}V} \right)}} \\ {z_{l} - {{Tr}\left( {H_{l}V} \right)}} & {- 1} \end{bmatrix} \preceq {0\mspace{11mu}{\forall{l.}}}} & \left( {8c} \right) \end{matrix}$

Proposition 1:

All three nonconvex optimization problems in (2), (7), and (8) solve an equivalent SE problem under the AC flow model. For the optima of these problems, it holds that

$\begin{matrix} {\hat{V}_{1}=\hat{V}_{2}=\hat{v}\hat{v}^{H}\ \text{ and }\ \hat{\chi}_{2,l}=\left[z_{l}-\mathrm{Tr}(H_{l}\hat{V}_{2})\right]^{2}\ \forall l.} & (9) \end{matrix}$

Proof:

First, notice that the outer-product $\hat{v}\hat{v}^{H}$ is feasible for (7). Expressing the rank-1 matrix as $\hat{V}_{1}=\hat{v}_{1}\hat{v}_{1}^{H}$ renders the complex $\hat{v}_{1}$ also feasible for (2). As Lemma 1 establishes the equivalence between the costs in (2) and (7a), their solutions can be related henceforth.

For the equivalence between (7) and (8), Schur's complement lemma for (8c) ensures that $\chi_{l}\geq[z_{l}-\mathrm{Tr}(H_{l}V)]^{2}\ \forall l$. Since (8a) minimizes a positively weighted sum of {χ_(l)}, the equality is further guaranteed at the optimum; i.e., $\hat{\chi}_{2,l}=[z_{l}-\mathrm{Tr}(H_{l}\hat{V}_{2})]^{2}\ \forall l$. Substituting this back to (8a) shows that $\hat{V}_{1}$ and $\hat{V}_{2}$ achieve the same minimum cost, which completes the proof.
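The Schur complement step used in the proof can be written out as follows. The shorthand r_(l) for the l-th residual is introduced here only for compactness and is not part of the original notation.

```latex
\begin{aligned}
\begin{bmatrix} -\chi_l & r_l \\ r_l & -1 \end{bmatrix} \preceq 0
&\iff \begin{bmatrix} \chi_l & -r_l \\ -r_l & 1 \end{bmatrix} \succeq 0 \\
&\iff \chi_l - (-r_l)(1)^{-1}(-r_l) \ge 0
   \quad \text{(Schur complement of the entry } 1 > 0\text{)} \\
&\iff \chi_l \ge r_l^{2}, \qquad r_l := z_l - \mathrm{Tr}(H_l V).
\end{aligned}
```

Because every w_(l) is positive, minimizing the weighted sum of the χ_(l) pushes each of them down to r_(l)², which recovers the quadratic summands of (7a).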

Remark 1 (Line Current Magnitude Measurements).

For certain distribution-level SE problems, line current magnitude measurements $|\check{I}_{mn}|$ are also available. It is worth pointing out that all SE reformulations can also handle this type of measurement. Since I_(mn) is linear in v, its magnitude squared |I_(mn)|² will be quadratic in v, and likewise linear in V as in (6). Hence, the equivalence in Proposition 1 and the ensuing analysis apply even with current magnitude measurements, which for brevity are henceforth omitted.

Convexifying SE Via SDR

Proposition 1 established the relevance of the novel SE formulation (8), which is nevertheless still nonconvex due to the rank-1 constraint. Fortunately, problem (8) is amenable to the SDR technique, which amounts to dropping the rank constraint and has well-appreciated merits as an optimization tool. The SDR technique has also recently provided new perspectives for a number of nonconvex problems in the field of signal processing and communications, thanks to its provable performance guarantees and implementation advantages. The contribution here consists in bringing the benefits of this powerful optimization tool to estimating the state of AC power systems. In the spirit of SDR, relaxing the rank constraint in (8b) leads to the following SDP

$\begin{matrix} {\{\hat{V},\hat{\chi}\}:=\arg\min\limits_{V,\chi}\ w^{T}\chi} & (10a) \\ {\text{s. to } V\succcurlyeq 0} & (10b) \\ {\begin{bmatrix} {-\chi_{l}} & {z_{l}-\mathrm{Tr}(H_{l}V)} \\ {z_{l}-\mathrm{Tr}(H_{l}V)} & {-1} \end{bmatrix}\preceq 0\ \forall l.} & (10c) \end{matrix}$

SDR endows SE with a convex SDP formulation for which efficient schemes are available to obtain the global optimum using, e.g., the interior-point solver SeDuMi. The worst-case complexity of this SDP problem is $\mathcal{O}(L^{4}\sqrt{N}\log(1/\epsilon))$ for a given solution accuracy ε>0. For typical power networks, L is in the order of N, and thus the worst-case complexity becomes $\mathcal{O}(N^{4.5}\log(1/\epsilon))$. In addition, it is possible to leverage the special problem structure to further reduce complexity. Indeed, all matrices in (4) are markedly sparse. For example, only the (n,n)-th entry of H_(V,n) is non-zero. A closer look reveals that all {H_(l)} matrices have non-zero entries only at their diagonal entries, and those (m,n)-th off-diagonal ones that correspond to transmission lines in $\mathcal{E}$. This can be used to exploit the so-called “chordal” data structure of V, which has led to major computational savings for solving the SDR-based OPF problem. Another special structure relates to the low rank of {H_(l)} matrices, which could greatly simplify the Schur complement matrix computation step of interior-point methods.
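The sparsity and low-rank structure described above can be inspected on a small example. The sketch below uses a hypothetical 5-bus chain with identical line admittances (illustrative values, not from the disclosure) and checks the injection matrices of (4a) only.

```python
import numpy as np

N = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]      # hypothetical 5-bus chain
y = 1 / (0.02 + 0.06j)                        # same illustrative admittance on every line

Y = np.zeros((N, N), dtype=complex)
for m, n in edges:
    Y[m, n] = Y[n, m] = -y
    Y[m, m] += y
    Y[n, n] += y

# Allowed support: the diagonal plus both orientations of every line.
allowed = np.eye(N, dtype=bool)
for m, n in edges:
    allowed[m, n] = allowed[n, m] = True

def injection_matrices(n):
    """Return (H_{P,n}, H_{Q,n}) of (4a) for bus n."""
    e_n = np.eye(N)[n]
    Y_n = np.outer(e_n, e_n) @ Y              # (3a)
    return 0.5 * (Y_n + Y_n.conj().T), 0.5j * (Y_n - Y_n.conj().T)

support_ok = True
max_nonzeros = 0
max_rank = 0
for n in range(N):
    for H in injection_matrices(n):
        nonzero = np.abs(H) > 1e-12
        support_ok &= not np.any(nonzero & ~allowed)
        max_nonzeros = max(max_nonzeros, int(nonzero.sum()))
        max_rank = max(max_rank, int(np.linalg.matrix_rank(H)))
```

In this example each matrix touches at most one row and one column of the N×N grid and has rank at most 2, which is the kind of structure a sparsity-exploiting SDP solver could use.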

Nonetheless, the SDP problem (10) is only a relaxed version of the equivalent SE in (8); hence, its solution $\hat{V}$ may have rank greater than 1, which makes it necessary to recover a feasible estimate $\hat{v}$ from $\hat{V}$. This is possible by eigen-decomposing $\hat{V}=\sum_{i=1}^{r}\lambda_{i}u_{i}u_{i}^{H}$, where $r:=\mathrm{rank}(\hat{V})$, $\lambda_{1}\geq\ldots\geq\lambda_{r}>0$ denote the positive ordered eigenvalues, and $\{u_{i}\in\mathbb{C}^{N}\}_{i=1}^{r}$ are the corresponding eigenvectors. Since the best (in the minimum-norm sense) rank-one approximation of $\hat{V}$ is $\lambda_{1}u_{1}u_{1}^{H}$, the state estimate can be chosen equal to $\hat{v}(u_{1}):=\sqrt{\lambda_{1}}u_{1}$. Besides this eigenvector approach, randomization offers another way to extract an approximate SE vector from $\hat{V}$, with quantifiable approximation accuracy. The basic idea is to generate multiple Gaussian distributed random vectors $v\sim\mathcal{CN}(0,\hat{V})$, and pick the one with the minimum WLS cost. Note that although any vector v is feasible for (2), it is still possible to decrease the minimum achievable cost by rescaling to obtain $\hat{v}(v)=\hat{c}v$, where the optimal weight can be chosen as the solution to the following convex problem as

$\begin{matrix} {\hat{c}=\arg\min\limits_{c>0}\sum\limits_{l=1}^{L}w_{l}\left[z_{l}-c^{2}v^{H}H_{l}v\right]^{2}=\sqrt{\frac{\sum\limits_{l=1}^{L}w_{l}z_{l}v^{H}H_{l}v}{\sum\limits_{l=1}^{L}w_{l}\left(v^{H}H_{l}v\right)^{2}}}.} & (11) \end{matrix}$
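The eigenvector recovery and the rescaling of (11) can be sketched in a few lines of NumPy. The function names, the error handling for a non-positive numerator, and the 3-bus test data are assumptions made for illustration; the randomization approach is not shown.

```python
import numpy as np

def principal_component_estimate(V_hat):
    """Return sqrt(lambda_1) * u_1 from the Hermitian PSD matrix V_hat."""
    eigvals, eigvecs = np.linalg.eigh(V_hat)      # eigenvalues in ascending order
    lam1 = max(eigvals[-1], 0.0)                  # guard against tiny negative round-off
    return np.sqrt(lam1) * eigvecs[:, -1]

def optimal_scale(v, z, H_list, w):
    """Closed-form scale of (11); raises if the ratio under the root is not positive."""
    a = np.array([np.vdot(v, H @ v).real for H in H_list])   # v^H H_l v
    num = np.sum(w * z * a)
    den = np.sum(w * a ** 2)
    if num <= 0 or den <= 0:
        raise ValueError("(11) needs a positive numerator and denominator")
    return float(np.sqrt(num / den))

# Hypothetical noise-free check with squared-magnitude measurements only.
N = 3
H_list = [np.outer(np.eye(N)[n], np.eye(N)[n]) for n in range(N)]
w = np.ones(N)
v_true = np.array([1.00 + 0.00j, 0.98 - 0.03j, 0.97 - 0.05j])
z = np.abs(v_true) ** 2

# A rank-one V_hat returns the state up to a common phase factor.
v_pc = principal_component_estimate(np.outer(v_true, v_true.conj()))
# A candidate that is twice too large is scaled back by (11).
c_hat = optimal_scale(2.0 * v_true, z, H_list, w)
```

The recovered vector matches the true state only up to a common phase, which the reference-bus convention resolves.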

Remark 2 (Reference Bus).

For power system SE, the reference bus convention is adopted, and the corresponding bus voltage angle is set to 0. As all measurements in (6) are quadratically related to v, the outer-product $(e^{j\Theta}v)(e^{j\Theta}v)^{H}=vv^{H}$ remains invariant to phase rotation Θ∈[−π,π]. To account for this, once an estimate $\hat{v}$ is recovered, it can be rotated by multiplying with $\hat{V}_{ref}^{H}/|\hat{V}_{ref}|$, where $\hat{V}_{ref}$ denotes the estimated reference-bus voltage.
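The rotation in Remark 2 amounts to one line of code. In the sketch below, the function name `align_to_reference`, the choice of bus 0 as the reference, and the test voltages are assumptions for illustration.

```python
import numpy as np

def align_to_reference(v_hat, ref=0):
    """Rotate v_hat so that the reference-bus voltage has zero angle."""
    return v_hat * np.conj(v_hat[ref]) / abs(v_hat[ref])

v_true = np.array([1.00 + 0.00j, 0.98 - 0.03j, 0.97 - 0.05j])
v_rotated = np.exp(1j * 0.7) * v_true          # same outer product, arbitrary common phase
v_aligned = align_to_reference(v_rotated, ref=0)
```

Both `v_rotated` and `v_true` produce the same outer product, so they are indistinguishable from quadratic measurements alone; the alignment step removes that ambiguity.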

Dual Equivalence

This section relates the relaxed and unrelaxed SE problems through the equivalence of their dual problems. Consider the dual of (10) by defining the Lagrange multiplier matrix associated with the l-th inequality in (10c) as

$\begin{matrix} {\mu_{l}:={\begin{bmatrix} \mu_{l,0} & \mu_{l,1} \\ \mu_{l,1} & \mu_{l,2} \end{bmatrix} \succcurlyeq 0.}} & (12) \end{matrix}$ Using (12), the Lagrangian corresponding to (10) becomes:

$\begin{matrix} \begin{matrix} {{\mathcal{L}\left( {V,\chi,\left\{ \mu_{l} \right\}} \right)}:={{w^{T}\chi} + {\sum\limits_{l = 1}^{L}{{Tr}\left\{ {\begin{bmatrix} {- \chi_{l}} & {z_{l} - {{Tr}\left( {H_{l}V} \right)}} \\ {z_{l} - {{Tr}\left( {H_{l}V} \right)}} & {- 1} \end{bmatrix}\mu_{l}} \right\}}}}} \\ {= {\sum\limits_{l = 1}^{L}{\left\{ {{\left( {w_{l} - \mu_{l,0}} \right)\chi_{l}} - \mu_{l,2} + {2\left\lbrack {{\mu_{l,1}z_{l}} - {{Tr}\left( {\mu_{l,1}H_{l}V} \right)}} \right\rbrack}} \right\}.}}} \end{matrix} & (13) \end{matrix}$

The dual problem amounts to maximizing over $\mu_{l}\succcurlyeq 0$ a cost equal to the minimum Lagrangian $\mathcal{L}(V,\chi,\{\mu_{l}\})$ over both χ and $V\succcurlyeq 0$. This minimum is attained when $w_{l}-\mu_{l,0}=0$ and $-2\sum_{l=1}^{L}\mu_{l,1}H_{l}\succcurlyeq 0$, which leads to the dual problem formulation as

$\begin{matrix} {\{\hat{\mu}_{l}\}_{l=1}^{L}:=\arg\min\limits_{\{\mu_{l}\}}\sum\limits_{l=1}^{L}\left(\mu_{l,2}-2\mu_{l,1}z_{l}\right)} & (14a) \\ {\text{s. to } \mu_{l}=\begin{bmatrix} w_{l} & \mu_{l,1} \\ \mu_{l,1} & \mu_{l,2} \end{bmatrix}\succcurlyeq 0,\ \forall l} & (14b) \\ {A:=2\sum\limits_{l=1}^{L}\mu_{l,1}H_{l}\preceq 0.} & (14c) \end{matrix}$
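The step from the Lagrangian (13) to the dual problem (14) can be spelled out term by term. The symbol g for the dual function is introduced here for illustration only.

```latex
\begin{aligned}
\inf_{\chi}\ \sum_{l=1}^{L}(w_l-\mu_{l,0})\,\chi_l
  &= \begin{cases} 0, & \mu_{l,0}=w_l\ \ \forall l \\ -\infty, & \text{otherwise} \end{cases} \\
\inf_{V\succeq 0}\ -\mathrm{Tr}(AV)
  &= \begin{cases} 0, & A\preceq 0 \\ -\infty, & \text{otherwise} \end{cases}
  \qquad A:=2\sum_{l=1}^{L}\mu_{l,1}H_l \\
g(\{\mu_l\})
  &= \sum_{l=1}^{L}\left(2\mu_{l,1}z_l-\mu_{l,2}\right)
  \quad \text{when } \mu_{l,0}=w_l \text{ and } A\preceq 0 .
\end{aligned}
```

Maximizing g over the multipliers is the same as minimizing the cost in (14a), and the two finiteness conditions become the constraints (14b) and (14c).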

Proposition 2:

The SDP problem in (14) is the dual of both the SE in (8) and the SDR-based SE in (10). Strong duality holds between (14) and (10), while the primal variable $V\succcurlyeq 0$ corresponds to the Lagrange multiplier associated with (14c).

Proof:

Given how Tr(AV) appears in the Lagrangian $\mathcal{L}$, it follows that V is the Lagrange multiplier for the dual constraint $A\preceq 0$ in (14c). Strong duality between these two convex problems holds because the primal SE problem admits a strictly feasible V=I and χ_(l)=(z_(l)−Tr(H_(l)))²+1 ∀l. To show that (14) is also the dual of the original problem in (8), recall that (8b) is equivalent to having $V=vv^{H}$ for some v. Therefore, the corresponding Lagrangian of the SE problem (8) can be obtained by substituting V's outer-product form into (13). Interestingly, the minimum of this new Lagrangian over χ and v is no different from the one in (13), and thus the same dual problem in (14) follows.

The strong duality asserted by Proposition 2 implies that the SDR-based SE problem (10) is also the dual of (14), and thus the Lagrangian bidual of the original SE (8). Hence, apart from the rank relaxation interpretation in the primal domain, this provides another interpretation based on the Lagrangian dual equivalence. Additional complexity reduction is envisioned from this dualization. The dual SE problem (14) entails 2L unknown variables, where for typical power networks the number of measurements L is in the order of the number of lines $|\mathcal{E}|$, which is of order N. Compared to the N×N complex matrix V in the primal SDP (10), the dual one has considerably fewer variables to optimize over.

PMU-Aided SDR-Based SE

Recent deployment of PMUs suggests complementing the measurements collected by legacy meters with PMU data to perform SE. Compared to legacy measurements, PMUs provide synchronous data that are linear functions of the state v. If bus m is equipped with a PMU, then its voltage phasor V_(m) and related current phasors $\{I_{mn}\}_{n\in\mathcal{N}_{m}}$ are available to the control center with high accuracy. Hence, with an adequate number of PMUs and wisely chosen placement buses, SE using only PMU data boils down to estimating a linear regression coefficient vector for which a batch WLS solution is available in closed form. However, the installation and networking costs involved allow only for limited penetration of PMUs in the near future. This means that SE may need to be performed using legacy meter and PMU measurements jointly.

To this end, let $\zeta_{m}=\Phi_{m}v+\varepsilon_{m}$ collect the noisy PMU data at bus m, where the linear regression matrix Φ_(m) is constructed in accordance with the bus index m and line admittances, while ε_(m) denotes the PMU measurement noise, assumed to be complex zero-mean Gaussian distributed with covariance $2\check{\sigma}_{m}^{2}I$, independent across buses and from the legacy meter noise terms {ε_(l)}. The SE task now amounts to estimating v given both z and $\{\zeta_{m}\}_{m\in\mathcal{P}}$, where $\mathcal{P}$ denotes the PMU-instrumented set of buses. Hence, the ML-optimal WLS cost in (2) must be augmented with the log-likelihood induced by PMU data, as

$\begin{matrix} {\hat{v}:=\arg\min\limits_{v}\sum\limits_{l=1}^{L}w_{l}\left[z_{l}-h_{l}(v)\right]^{2}+\sum\limits_{m\in\mathcal{P}}\omega_{m}\left\|\zeta_{m}-\Phi_{m}v\right\|_{2}^{2}} & (15) \end{matrix}$ where $\omega_{m}:=1/\check{\sigma}_{m}^{2}\ \forall m\in\mathcal{P}$. The augmented SE problem (15) is still nonconvex due to the quadratic dependence of legacy measurements on the sought state v.
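When only the PMU term of (15) is kept, the problem is an ordinary linear least-squares fit with a closed-form solution. The sketch below illustrates this under stated assumptions: the row layout of Φ_(m) (the bus voltage followed by the line currents, using the line current expression given earlier), the function names, and the 3-bus chain with a single PMU are hypothetical choices, not the disclosure's construction.

```python
import numpy as np

def pmu_regression_matrix(m, neighbors, N):
    """Stack the rows mapping v to [V_m, {I_mn}] for a PMU at bus m.

    neighbors: list of (n, y_mn, y_shunt) tuples (an assumed input format).
    """
    e = np.eye(N, dtype=complex)
    rows = [e[m]]                                        # V_m = e_m^T v
    for n, y_mn, y_sh in neighbors:
        rows.append((y_sh + y_mn) * e[m] - y_mn * e[n])  # I_mn = y_sh V_m + y_mn (V_m - V_n)
    return np.vstack(rows)

def pmu_only_wls(Phis, zetas, omegas):
    """Solve min_v sum_m omega_m ||zeta_m - Phi_m v||^2 via the normal equations."""
    G = sum(om * Phi.conj().T @ Phi for Phi, om in zip(Phis, omegas))
    b = sum(om * Phi.conj().T @ zeta for Phi, zeta, om in zip(Phis, zetas, omegas))
    return np.linalg.solve(G, b)

# Hypothetical 3-bus chain 0-1-2 with a single PMU at bus 1.
N = 3
y01, y12, y_sh = 5.0 - 15.0j, 4.0 - 12.0j, 0.02j
Phi_1 = pmu_regression_matrix(1, [(0, y01, y_sh), (2, y12, y_sh)], N)
v_true = np.array([1.00 + 0.00j, 0.98 - 0.03j, 0.97 - 0.05j])
zeta_1 = Phi_1 @ v_true                                  # noise-free PMU data
v_hat = pmu_only_wls([Phi_1], [zeta_1], [1.0])
```

In this small example one PMU at the middle bus observes the whole chain, so the normal equations are invertible. With fewer PMU measurements than needed for observability they would be singular, which is why the legacy measurements remain necessary.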

Existing SE methods that account for PMU measurements can be categorized in two groups. The first one includes the so-termed hybrid SE approaches which utilize both PMU and legacy measurements in a WLS solver via iterative linearization. Depending on the number of PMUs, the state can be either expressed using polar coordinates (similar to traditional WLS-based SE), or by rectangular coordinates (as the notation v here). The polar representation is preferred when legacy measurements are abundant, because it requires minor adaptations of the existing WLS-based SE. On the other hand, the polar representation is less powerful when it comes to exploiting the linearity of PMU measurements. With full penetration of PMUs in the future, the rectangular representation is expected to grow in popularity, especially if full observability can be ensured by the sole use of PMU measurements.

An alternative approach to including PMU data is through sequential SE, which entails two steps. The WLS-based SE is performed first based only on legacy measurements. These state estimates serve the role of linear “pseudo-measurements” for the subsequent step together with PMU measurements. The post-processing involves linear models only, and is efficiently computable. Clearly, sequential SE requires no modifications of existing SE modules, but loses the optimality offered by joint estimation. More severely, if the traditional SE based only on legacy measurements fails to converge to a global optimum, the post-processing including PMU data is unlikely to improve estimation accuracy.

Since both of these options for including PMU data suffer from the nonconvexity present with legacy measurements, the SDR technique is again well motivated to convexify the augmented SE to

$\begin{matrix} {\left\{ \hat{X},\hat{\chi} \right\} := \arg\min\limits_{X,\chi} w^{T}\chi + \sum\limits_{m \in \mathcal{P}} \omega_{m}\left\lbrack \mathrm{Tr}\left( \Phi_{m}^{\mathcal{H}}\Phi_{m}V \right) - 2\mathrm{Re}\left( \zeta_{m}^{\mathcal{H}}\Phi_{m}v \right) \right\rbrack} & (16a) \\ {\text{s. to}\quad X = \begin{bmatrix} V & v \\ v^{\mathcal{H}} & 1 \end{bmatrix} \succcurlyeq 0,} & (16b) \\ {\begin{bmatrix} -\chi_{l} & z_{l} - \mathrm{Tr}\left( H_{l}V \right) \\ z_{l} - \mathrm{Tr}\left( H_{l}V \right) & -1 \end{bmatrix} \preceq 0\quad \forall l.} & (16c) \end{matrix}$ Similar to the SDR-based SE problem (10), with the additional constraint rank(X)=1, the positive semi-definiteness of X ensures $V = vv^{\mathcal{H}}$. Substituting the latter into (16), and following the proof of Proposition 1 leads to the equivalence of rank-constrained (16) with the augmented WLS in (15). The SDP problem here also offers the advantages of (10), in terms of polynomial complexity and dual equivalence. From the solution {circumflex over (X)}, either eigenvector approximation or randomization can be employed to generate vectors of length N+1. Using the first N entries of any such vector, a feasible state estimate can be formed by proper rescaling. With linear PMU measurements, the voltage angle ambiguity is no longer present, and the rescaling factor ĉ* can be found by solving

$\begin{matrix} {\hat{c}^{*} = \arg\min\limits_{c \in \mathbb{C}} \sum\limits_{l=1}^{L} w_{l}\left\lbrack z_{l} - |c|^{2}v^{\mathcal{H}}H_{l}v \right\rbrack^{2} + \sum\limits_{m \in \mathcal{P}} \omega_{m}\left\| \zeta_{m} - c\,\Phi_{m}v \right\|_{2}^{2}.} & (17) \end{matrix}$ This is a fourth-order polynomial minimization problem and is numerically solvable. It is often the case that the PMUs have much higher accuracy than legacy meters, so that their corresponding weights are much larger, that is, ω_(m)>>w_(l). Hence, it suffices to minimize only the dominant second summand in (17), and efficiently approximate the solution

$\hat{c}^{*} \approx \left( \sum\limits_{m \in \mathcal{P}} \omega_{m}v^{\mathcal{H}}\Phi_{m}^{\mathcal{H}}\zeta_{m} \right) / \left( \sum\limits_{m \in \mathcal{P}} \omega_{m}\left\| \Phi_{m}v \right\|_{2}^{2} \right).$ This approximation will be used in the numerical tests of the ensuing section.
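The approximate rescaling step can be illustrated numerically. The following is a minimal sketch, not the implementation used in the tests: it computes the minimizer of the PMU term alone, namely the weighted least-squares coefficient with numerator Σ ω_m (Φ_m v)^H ζ_m and denominator Σ ω_m ∥Φ_m v∥², and checks it on synthetic data. The function name `pmu_rescaling_factor`, the random matrices standing in for Φ_m, and the toy dimensions are all assumptions made for illustration.

```python
import numpy as np

def pmu_rescaling_factor(v, Phis, zetas, omegas):
    """Scale c minimizing sum_m omega_m * ||zeta_m - c * Phi_m v||_2^2."""
    num = 0.0 + 0.0j
    den = 0.0
    for Phi, zeta, omega in zip(Phis, zetas, omegas):
        u = Phi @ v                          # PMU reading predicted by the unscaled vector
        num += omega * np.vdot(u, zeta)      # np.vdot conjugates its first argument: u^H zeta
        den += omega * np.vdot(u, u).real    # ||Phi_m v||^2
    return num / den

# Synthetic check (hypothetical data): PMU readings generated from c_true * v, no noise.
rng = np.random.default_rng(0)
N = 4
v = rng.standard_normal(N) + 1j * rng.standard_normal(N)
c_true = 0.8 * np.exp(1j * 0.3)
Phis = [rng.standard_normal((2, N)) + 1j * rng.standard_normal((2, N)) for _ in range(2)]
zetas = [Phi @ (c_true * v) for Phi in Phis]
omegas = [1.0 / 0.002**2] * 2                # weights 1/sigma^2 with sigma = 0.002
c_hat = pmu_rescaling_factor(v, Phis, zetas, omegas)
```

Because the factor is complex, it fixes both the magnitude scaling and the common phase rotation of the candidate vector, which is why the angle ambiguity disappears once PMU data are present.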

Numerical Tests

FIG. 2 illustrates an IEEE 30-bus system commonly used for power system testing. The SDR-based SE techniques described herein were tested using the IEEE 30-bus system with 41 transmission lines, and compared to existing WLS methods that are based on Gauss-Newton iterations. Different legacy meter or PMU placements and variable levels of voltage angles were considered. The software toolbox MATPOWER was used to generate the pertinent power flow and meter measurements. In addition, its SE function was adapted to realize the WLS Gauss-Newton iterations. The iterations terminated either upon convergence, or once the condition number of the approximate linearization exceeded 10⁸, which flags divergence of the iterates. To solve the (augmented) SDR-based SE problems, the MATLAB-based optimization modeling package CVX was used, together with the interior-point method solver SeDuMi.

Test Case 1:

The real and reactive power flows along all 41 lines were measured, together with voltage magnitudes at all 30 buses. Independent Gaussian noise corrupts all measurements, with σ_(l) equal to 0.02 at power meters, and 0.01 at voltage meters. Except for the reference bus phasor V_(ref)=1, each bus had its voltage magnitude Gaussian distributed with mean 1 and variance 0.01, and its voltage angle uniformly distributed over [−θ, θ]. For three choices, namely θ=0.3π, 0.4π, and 0.5π, the empirical estimation errors ∥v−{circumflex over (v)}∥₂ were averaged over 500 Monte-Carlo realizations for the SDR approach and the WLS one using three different initializations, as listed in Table I. The percentage of realizations in which the iterative WLS method converged is also given in parentheses. The SDR estimator was recovered from the SDR-based SE solution {circumflex over (V)}, by picking the minimum-cost vector over the eigenvector solution and 50 randomization samples. The first WLS estimator, termed WLS/FVP, corresponds to the WLS solution initialized by the flat-voltage profile (FVP) point; that is, the one using the all-one vector as an initial guess. For a better starting point, the second WLS/DC one was obtained by initializing the voltage angles using the DC model SE, and the magnitudes using the corresponding meter measurements. To gauge the SDR approach's near-optimal performance with respect to the global solution, the SDR estimator was further used to initialize the WLS iterations, and the abbreviation used for this estimator is WLS/SDR.
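The recovery of a state vector from the relaxed solution, as described above, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the code used in the tests: random candidates are drawn as zero-mean complex Gaussian vectors with covariance {circumflex over (V)}, the cost is the legacy WLS cost with Hermitian matrices H_l, and the names `wls_cost` and `recover_state` are hypothetical. Any rescaling of the candidates is left out.

```python
import numpy as np

def wls_cost(v, H_list, z, w):
    """Legacy WLS cost sum_l w_l * (z_l - v^H H_l v)^2 for Hermitian H_l."""
    h = np.array([np.vdot(v, H @ v).real for H in H_list])
    return float(np.sum(w * (z - h) ** 2))

def recover_state(V_hat, H_list, z, w, num_samples=50, seed=0):
    """Return the minimum-cost vector among the eigenvector candidate and random samples."""
    rng = np.random.default_rng(seed)
    eigval, eigvec = np.linalg.eigh(V_hat)        # eigenvalues in ascending order
    eigval = np.clip(eigval, 0.0, None)           # guard against tiny negative round-off
    candidates = [np.sqrt(eigval[-1]) * eigvec[:, -1]]   # principal-eigenvector candidate
    root = eigvec * np.sqrt(eigval)               # root @ root^H reproduces V_hat
    n = V_hat.shape[0]
    for _ in range(num_samples):
        g = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
        candidates.append(root @ g)               # random vector with covariance V_hat
    costs = [wls_cost(c, H_list, z, w) for c in candidates]
    best = int(np.argmin(costs))
    return candidates[best], costs[best]

# Toy check: a rank-one V_hat built from a known 3-bus state, |V_n|^2 measurements only.
v_true = np.array([1.0, 0.95 * np.exp(-0.1j), 1.02 * np.exp(0.05j)])
H_list = [np.outer(e, e).astype(complex) for e in np.eye(3)]
z = np.array([np.vdot(v_true, H @ v_true).real for H in H_list])
w = np.ones(3)
v_hat, cost = recover_state(np.outer(v_true, v_true.conj()), H_list, z, w)
```

When the relaxed solution is exactly rank one, the eigenvector candidate already attains zero cost; the random samples matter only when the relaxation returns a higher-rank matrix.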

TABLE I
ESTIMATION ERROR WITH % OF CONVERGENCE FOR TEST CASE 1.

θ      SDR     WLS/FVP         WLS/DC          WLS/SDR
0.3π   0.070   0.097 (98.6%)   0.042 (100%)    0.042 (100%)
0.4π   0.081   0.593 (88.6%)   0.255 (97.2%)   0.044 (100%)
0.5π   0.088   2.228 (68.6%)   1.161 (88.0%)   0.047 (100%)

Table I clearly shows that the DC model based SE provides a much better initialization compared to the FVP one, in terms of smaller estimation error and higher probability of convergence. When the actual voltage angles are small (θ=0.3π), the WLS linear approximation is quite accurate with either the FVP or the DC model based initialization, and thus convergence is attained in nearly all realizations (98.6% and 100%, respectively). Especially for the WLS/DC with θ=0.3π, the empirical error 0.042 can be considered as the benchmark estimation error achieved for such meter placements and noise levels. As θ increases, however, the nonlinearity in the measurement model is responsible for the performance degradation exhibited by the WLS/FVP and WLS/DC estimators. Interestingly, estimation accuracy of the SDR estimator is still competitive with the benchmark and comes close to the global optimum. With any choice of θ, the WLS/SDR estimator is always convergent and attains the benchmark accuracy 0.042 within numerical accuracy. This suggests that the SDR-based estimator comes with numerically verifiable approximation bounds relative to the global optimum. Further evidence to this effect is provided by the empirical voltage angle and magnitude errors per bus.

FIGS. 3A1, 3A2, 3B1 and 3B2 compare estimation errors in voltage magnitudes and angles between SDR and WLS solvers at different buses for Test Case 1. With θ=0.3π, FIGS. 3A1 and 3A2 demonstrate that the SDR estimator exhibits an error pattern across buses similar to that of both WLS/DC and WLS/SDR, with errors roughly twice those of these two optimal schemes. However, as θ increases to 0.4π, FIGS. 3B1 and 3B2 illustrate that the WLS/DC estimator blows up due to possible divergence, especially in the angle estimates, while the SDR and WLS/SDR estimators show comparable accuracy as well as analogous performance. This test case numerically supports the near-optimal performance of the proposed SDR-based SE algorithm.

Test Case 2:

Here 19 line flow meters and 15 bus injection meters are used, together with 30 voltage magnitude meters. Although full observability is ensured, a certain number of lines are not directly observed. Thus, quadratic coupling of measurements affects SE in those indirectly observed lines and leads to performance degradation, as confirmed by Table II. The relative performance and convergence probability among different estimators for various choices of θ follow the trends of Test Case 1, but the placement here yields a larger benchmark estimation error around 0.11. As a result, the impact of initialization is more significant here, as for θ=0.3π the WLS/DC iterations diverge in nearly 10% of realizations.

FIGS. 4A1, 4A2, 4B1 and 4B2 are graphs that compare estimation errors in voltage magnitudes and angles between SDR and WLS solvers for Test Case 2. A close look at the error plots in FIGS. 4B1 and 4B2 reveals that the estimation errors at buses 1 through 8 are more or less similar to the optimal ones, especially for the angle errors. Hence, divergence of the WLS/DC estimator due to insufficient direct flow measurements affects the estimates at buses 9 through 30. Nonetheless, the SDR-based estimator still offers near-optimal performance relative to the benchmark WLS/SDR one for any θ.

TABLE II
ESTIMATION ERROR WITH % OF CONVERGENCE FOR TEST CASE 2.

θ      SDR     WLS/FVP         WLS/DC          WLS/SDR
0.2π   0.174   0.265 (96.6%)   0.148 (99.4%)   0.115 (100%)
0.3π   0.203   1.759 (68.6%)   0.653 (91.4%)   0.109 (100%)
0.4π   0.247   3.521 (47.0%)   2.141 (66.0%)   0.104 (100%)

Test Case 3:

To tackle the issue of insufficient direct measurements in Test Case 2, PMUs are deployed to enhance the SE performance offered by legacy measurements with θ=0.4π. The PMU meter noise level is set to {hacek over (σ)}_(m)=0.002 at all buses. The greedy approach using the A-optimal placement of PMUs selects the four buses {10, 12, 27, 15} to be equipped with PMUs sequentially. Since the WLS iterations with only legacy measurements are not guaranteed to converge, as verified by Table II, the sequential approach of including PMU data does not lead to improved convergence. Hence, the joint WLS-based SE approach using polar representation of the state is adopted for comparison. The WLS initialization combines the linear estimates at those observable buses based on the PMU measurements, with the DC model angle estimates mentioned earlier. The SDR estimator is obtained from the solution {circumflex over (X)} in (16) with the approximate rescaling factor ĉ* based only on PMU measurements, which also serves as initial guess to obtain the WLS/SDR estimator. The empirical estimation errors for 0 to 4 PMUs are listed in Table III, where the PMU-absent results are repeated from Table II. As the number of PMUs increases, the estimation accuracy as well as the probability of convergence improve for the WLS estimator. Still, there is a considerable gap relative to the other two estimators based on the SDR solution. Using more PMUs, the SDR estimator better approximates the optimal WLS/SDR one, with the approximation gap coming very close to 1. This is illustrated in FIGS. 5A and 5B, which are graphs that compare estimation errors between SDR and WLS solvers versus the number of PMUs for angle estimates and magnitude estimates. In FIGS. 5A and 5B, empirical angle and magnitude errors in the logarithmic scale are averaged over all 30 buses and plotted versus the number of PMUs deployed. The difference between the SDR and WLS/SDR estimators strictly diminishes as the number of PMUs increases, which suggests that the approximation accuracy of the SDR approach relative to the globally optimal one can be markedly aided by the use of PMU data.

TABLE III
ESTIMATION ERROR WITH % OF CONVERGENCE FOR TEST CASE 3.

# of PMU   SDR     WLS             WLS/SDR
0          0.247   2.141 (66.0%)   0.104 (100%)
1          0.122   0.678 (94.4%)   0.063 (100%)
2          0.062   0.335 (96.6%)   0.040 (100%)
3          0.036   0.280 (98.8%)   0.025 (100%)
4          0.019   0.061 (99.6%)   0.015 (100%)

Test Case 4:

To investigate the performance and scalability of the proposed algorithms, the IEEE 57- and 118-bus systems were tested extensively under scenarios similar to Test Cases 1-3 for the 30-bus system. For example, Table III lists the estimation error performance based on the 118-bus system with all real and reactive power flow measurements along all 186 lines. Other simulation settings follow exactly Test Case 1. Clearly, the results here again confirm the near-optimal error performance of the proposed SDR-SE approach, while the Gauss-Newton iterative method performs progressively worse as divergence becomes more frequent with growing system size, especially if good initializations are not available (θ=0.5π). Additional numerical tests verifying the near-optimal performance and the capability to include PMU data demonstrated results identical to those of Test Cases 1-3, and hence are omitted due to page limitations.

Instead, the computational costs for all three systems are compared in Table IV. Both algorithms are run using the MATLAB R2011a software, on a typical Windows XP computer with a 2.8 GHz CPU. The SDR-based solver takes reasonably more time (around 20 seconds for the 118-bus system), which scales gracefully with the system size. The WLS iterations incur increasing computational time mainly due to the higher divergence rate in larger systems. Notice that here the general-purpose SeDuMi is used to solve (10), without exploiting its special problem structure. Much more efficient solvers are expected with distributed and parallel computational modules, as well as sparsity-exploiting convex optimization techniques, which are currently pursued.

TABLE IV
AVERAGE RUNNING TIMES IN SECONDS.

# of buses   WLS     SDR
30           0.216   1.62
57           0.558   4.32
118          2.87    21.6

Novel SDR-based SE schemes were disclosed for power system monitoring, by judiciously reformulating the nonlinear relationship between legacy meter measurements and complex bus voltages. The nonconvex SE problem was relaxed to a convex SDP one, and thus rendered efficiently solvable via existing interior-point methods. In addition, the convex dual SE problem was formulated to provide insights on the dual equivalence and guidelines to reduce complexity. To account for recent developments in PMU technology, linear state measurements were also incorporated to enhance the proposed SDR-based SE framework. Extensive numerical tests on the 30-bus benchmark system demonstrated the near-optimal performance of the novel approaches.

Further enhancements to the SDR-based SE framework are described below with respect to multi-area distributed counterparts and tailored solvers exploiting the SDP structure. Robust techniques are also described under a cyber-security context to account for malicious attacks or outliers.

In some examples, energy management systems may be combined to form a larger network of interconnected power grids. In accordance with the present disclosure, one or more energy management systems may each use SDR to process a linear estimate of the current state of each bus in its respective power system. The energy management systems may further communicate these state estimates to one another, or to a central energy management system. The central energy management system may process the received estimates to compute a global solution of estimates of electrical characteristics for the entire power grid, that is, state estimates of all buses in each of the interconnected power grids.

FIG. 6 shows another example network 42 of multiple interconnected electrical power systems 40A-40P (power systems 40). In one example, each of power systems 40 contains a number of electrical power buses as well as an energy management system within its respective service area. The energy management systems of power systems 40 may communicate state estimates of their respective power systems to one another. In other examples, each energy management system of power systems 40 may communicate state estimates of their respective power systems to a central energy management system. This system may then process the estimates to determine a global solution providing state estimates of all power buses in each of power systems 40.

An energy management system may be programmed to take certain action in response to computing estimates for the electrical characteristic for all of the buses within the power grid. For example, the energy management system may control operation of one or more of the electrical buses based on the computed estimates. As another example, the energy management system may generate reports that specify power consumption at each of the buses within the power grid based on the computed estimates. The energy management system may output communications to automatically update an accounting system to charge respective operators of each power bus based on the computed estimates. The energy management system may generate an alert based on the computed estimates.

Robust Power System State Estimation for the Nonlinear AC Flow Model

The electric power grid is a complex cyber-physical system consisting of multiple modules, each with a transmission infrastructure spanning over a huge geographical area, transporting energy from generation sites to distribution networks. Monitoring the operational conditions of grid transmission networks is of paramount importance to facilitate system control and optimization tasks, including security analysis and economic dispatch with security constraints. An important monitoring task for power systems is accurate estimation of the system operation state.

For this purpose, various system variables are measured in distant buses and then transmitted to the control center for estimating the system state variables, namely complex bus voltages. Due to the wide spread of transmission networks and the current integration of enhanced computer/communication infrastructure, the power system state estimation (SE) is challenged by data integrity concerns arising due to “anomalous” measurements affected by outliers and/or adversarial cyber-attacks. Under the nonlinear AC power flow model, the state estimation (SE) problem is inherently nonconvex giving rise to many local optima. In addition to nonconvexity, SE is challenged by data integrity and cyber-security issues. For the AC power flow model however, SE challenges come not only from anomalous data, but are further magnified due to the nonlinear couplings present between meter measurements and state variables. These concerns motivate the development of robust approaches described herein to improve resilience of SE to anomalous (a.k.a. bad) data.

Robust techniques are described that are resilient to outlying measurements and/or adversarial cyber-attacks. In one example implementation, an overcomplete additive outlier-aware measurement model is adopted, and the sparsity of outliers is leveraged to develop a robust state estimation (R-SE) approach to jointly estimate system states and identify the outliers present. Observability and identifiability issues of this model are investigated, and links are established between R-SE and error control coding. The convex semidefinite relaxation (SDR) technique is further pursued to render the nonconvex R-SE problem efficiently solvable. The resultant algorithm markedly outperforms existing iterative alternatives, as corroborated through numerical tests on the standard IEEE 30-bus system.

Referring again to FIG. 1, in some instances, unobservable cyber-attacks to compromise measurements provided by one or more of sensors 14 may fail to be detected by a system operator. A unifying framework is described herein to understand how tolerant the nonlinear regression model is to data corruption, by introducing the notion of measurement distance. The latter is nicely connected to distance metrics popular in channel coding theory, which are known to determine the error-control capability of channel codes. This connection reveals why the measurement distance is useful to characterize the regression function's resilience to outliers.

In addition, the novel R-SE framework described herein lends itself to a convex relaxation approach, which yields R-SE solvers approximating the global optimum. Semidefinite relaxation (SDR) solvers thus emerge as powerful schemes for R-SE of nonlinear AC power flow models. Preliminary tests on the IEEE 30-bus system, as described below, corroborate the performance improvement of the proposed approach. With respect to FIG. 1, consider again a power transmission network 12 with N buses 14 denoted by the set of nodes $\mathcal{N} := \{1, \ldots, N\}$, and L transmission lines 15 represented by the set of edges $\mathcal{E} := \{(n,m)\} \subseteq \mathcal{N} \times \mathcal{N}$. Suppose M measurements are taken for estimating the complex voltage states $\{V_{n}\}_{n \in \mathcal{N}}$ from a subset of the following system variables:

- P_(n) (Q_(n)): the real (reactive) power injection at bus n (negative if bus n is connected to a load);
- P_(mn) (Q_(mn)): the real (reactive) power flow from bus m to bus n; and
- |V_(n)|: the voltage magnitude at bus n.

Compliant with the AC power flow model, these measurements obey nonlinear equations relating them with the system state vector v:=[V₁, . . . , V_(N)]^(T)∈C^(N). These equations also involve the injected currents of all buses that are here collected in the vector i:=[I₁, . . . , I_(N)]^(T)∈C^(N), as well as the currents flowing from, say, bus m to bus n, denoted by I_(mn). Kirchhoff's law in vector-matrix form simply dictates i=Yv, where Y∈C^(N×N) denotes the grid's symmetric bus admittance matrix having (m,n)-th entry given by:

$\begin{matrix} {Y_{mn} := \left\{ \begin{matrix} {-y_{mn},} & {\text{if } (m,n) \in \mathcal{E}} \\ {y_{nn} + \sum\limits_{\nu \in \mathcal{N}_{n}} y_{n\nu},} & {\text{if } m = n} \\ {0,} & {\text{otherwise}} \end{matrix} \right.} & (1.A) \end{matrix}$ with y_(mn) denoting the line admittance between buses m and n; y_(nn) bus n's admittance to the ground; and $\mathcal{N}_{n}$ the set of all buses linked to bus n through transmission lines. In addition, the current flow is given by $I_{mn} = \bar{y}_{mn}V_{m} + y_{mn}(V_{m} - V_{n})$, with $\bar{y}_{mn}$ standing for the shunt admittance at bus m associated with line (m,n). Clearly, all current variables are linearly related to the state v. As for the nonlinear measurements, the AC power flow model asserts that the apparent power injection into bus n is given by $P_{n} + jQ_{n} = V_{n}I_{n}^{*}$, while the apparent power flow from bus m to bus n by $P_{mn} + jQ_{mn} = V_{m}I_{mn}^{*}$, with the superscript * denoting complex conjugation. Further, expressing the squared bus voltage magnitude as $|V_{n}|^{2} = V_{n}V_{n}^{*}$, it is clear that all measurable quantities listed earlier are nonlinearly (in fact quadratically) related to v.
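The relations i=Yv and P_n+jQ_n=V_n·conj(I_n) can be checked numerically on a small network. The sketch below is illustrative only: the three-bus line data are made up, the helper names `bus_admittance` and `power_injections` are hypothetical, line shunts are ignored, and the Hermitian matrix `H_P` expressing P_n as a quadratic form in v is derived here from i=Yv rather than quoted from the text.

```python
import numpy as np

def bus_admittance(num_buses, lines, shunt_to_ground=None):
    """Assemble Y as in (1.A); `lines` maps (m, n) to the series admittance y_mn (0-based)."""
    Y = np.zeros((num_buses, num_buses), dtype=complex)
    for (m, n), y in lines.items():
        Y[m, n] -= y                 # off-diagonal entries: -y_mn
        Y[n, m] -= y
        Y[m, m] += y                 # diagonal entries accumulate incident line admittances
        Y[n, n] += y
    if shunt_to_ground is not None:
        Y += np.diag(shunt_to_ground)    # y_nn terms (admittance to ground)
    return Y

def power_injections(Y, v):
    """Complex injections P_n + jQ_n = V_n * conj(I_n) with i = Y v."""
    return v * np.conj(Y @ v)

# Hypothetical 3-bus network with series impedances r + jx on each line.
lines = {(0, 1): 1 / (0.02 + 0.06j), (0, 2): 1 / (0.08 + 0.24j), (1, 2): 1 / (0.06 + 0.18j)}
v = np.array([1.0, 0.98 * np.exp(-0.05j), 0.97 * np.exp(-0.08j)])
Y = bus_admittance(3, lines)
s = power_injections(Y, v)

# Quadratic-form check for the real injection at bus n: P_n = v^H H_P v.
n = 1
e = np.zeros(3)
e[n] = 1.0
A = Y.conj().T @ np.outer(e, e)      # v^H A v equals V_n * conj(I_n)
H_P = (A + A.conj().T) / 2           # Hermitian part gives the real power
p_quadratic = np.vdot(v, H_P @ v).real
```

The agreement between `p_quadratic` and the real part of `s[n]` is the quadratic dependence on v that the relaxation later exploits.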

Apart from the nonlinearity present, another challenge in SE is due to grossly corrupted meter measurements (a.k.a. bad data). Statistical tests such as the largest normalized residuals of the weighted least-squares (WLS) estimation error are typically employed to reveal and remove bad data. Alternatively, robust estimators, such as the least-absolute deviation, or Huber's M-estimators have also been considered. The robust SE (R-SE) techniques described herein make use of an overcomplete model for the outlying data. To this end, collect first the M measurements in the vector $z := [\{\check{P}_{n}\}_{n \in \mathcal{N}}, \{\check{Q}_{n}\}_{n \in \mathcal{N}}, \{\check{P}_{mn}\}_{(m,n) \in \mathcal{E}}, \{\check{Q}_{mn}\}_{(m,n) \in \mathcal{E}}, \{|\check{V}_{n}|^{2}\}_{n \in \mathcal{N}}]^{T}$, where the check mark differentiates measured values from the noise-free variables. For consistency with other measurements, $|V_{n}|^{2}$ is considered from now on. This is possible by adopting $|\check{V}_{n}| = |V_{n}| + \epsilon_{V}$, where ε_(V) is zero-mean Gaussian with small variance σ_(V)², to obtain the approximate model $|\check{V}_{n}|^{2} \approx |V_{n}|^{2} + \epsilon_{V}'$, where ε_(V)′ has variance $4|\check{V}_{n}|^{2}\sigma_{V}^{2}$. Consider also the scalar variables $\{a_{l}\}_{l=1}^{M}$, one per measurement, taking the value a_(l)=0 if the l-th measurement obeys the nominal (outlier-free) model, and a_(l)≠0 if it corresponds to a bad datum. This way, the nonlinear measurement model becomes:

$\begin{matrix} {z_{l} = h_{l}(v) + \epsilon_{l} + a_{l},\quad l = 1,\ldots,M} & (2.A) \end{matrix}$ where h_(l)(•) captures the quadratic relationship specified by the aforementioned AC power flow equations, and the zero-mean additive Gaussian white noise (AWGN) ε_(l) is assumed uncorrelated across meters with variance σ_(l)².
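The variance quoted for ε_(V)′ can be traced to a first-order expansion of the squared magnitude measurement. As a brief sketch, assuming ε_(V) is small relative to |V_(n)| so that the squared noise term can be neglected:

$\begin{matrix} {|\check{V}_{n}|^{2} = \left( |V_{n}| + \epsilon_{V} \right)^{2} = |V_{n}|^{2} + 2|V_{n}|\epsilon_{V} + \epsilon_{V}^{2} \approx |V_{n}|^{2} + \epsilon_{V}',\quad \epsilon_{V}' := 2|V_{n}|\epsilon_{V},} \\ {\mathrm{var}\left( \epsilon_{V}' \right) = 4|V_{n}|^{2}\sigma_{V}^{2} \approx 4|\check{V}_{n}|^{2}\sigma_{V}^{2},} \end{matrix}$

where the last step replaces the unknown magnitude |V_(n)| by its measured value.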

Recovering both v and the M×1 vector a:=[a₁, . . . , a_(M)]^(T) essentially reveals the state and identifies faulty measurements. In this way, the vector a of the model is a state vector that indicates the untrustworthiness of the measurements and identifies faulty measurements. In this example implementation, a zero entry indicates a trustworthy measurement and a non-zero entry indicates that the measurement is not trustworthy. By having an indication of the trustworthiness, the techniques can apply semidefinite relaxation to compute a solution for the state of the power buses while eliminating outliers, such as bad data due to communication error or data that has been compromised due to attack. In other words, the techniques construct and solve the state estimation problem such that any meter measurement that differs significantly from those of the other meters is identified and eliminated as an outlier.

The system in (2.A) with both v and a being unknown is under-determined, as the number of measurements M is always less than the number of unknowns N+M. Instrumental to handling this under-determinacy will be the (arguably low) percentage of outliers, which gives rise to a (high) level of sparsity, that is the number of zero entries in a. The degree of sparsity will be further linked in the ensuing section with the notions of observability and identifiability of the outlier vector. By capitalizing on the sparsity of a, the goal of jointly estimating and identifying v and a can be achieved by the following outlier-sparsity-controlling criterion:

$\begin{matrix} {\left\{ \hat{v},\hat{a} \right\} := \arg\min\limits_{v,a} \sum\limits_{l=1}^{M} w_{l}\left\lbrack z_{l} - h_{l}(v) - a_{l} \right\rbrack^{2} + \lambda\left\| a \right\|_{0}} & (3.A) \end{matrix}$ where w_(l):=1/σ_(l)² ∀l, and λ>0 scales the regularization term which comprises the l₀-pseudonorm, i.e., the number of non-zero a_(l)'s, which naturally controls the number of outliers in $\hat{a}$. In this way, the coefficient λ provides a configurable parameter for outlier control. The coefficient λ can be adjusted to control the level of trustworthiness to be applied to the data. The coefficient λ can be selected in a variety of ways, such as by an administrator based on a statistical test, prior knowledge, or data.
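The role of λ in (3.A) can be made concrete by looking at the minimization over a alone, with the state held fixed. In that case the cost decouples across measurements: flagging measurement l as an outlier costs λ, while leaving it unflagged costs w_l times the squared residual. The sketch below illustrates only this sub-step and is not the SDR-based solver described in this disclosure; the function name `outlier_update` and the numbers are hypothetical.

```python
import numpy as np

def outlier_update(residual, w, lam):
    """Minimize sum_l w_l * (r_l - a_l)^2 + lam * ||a||_0 over a, with r = z - h(v) fixed."""
    residual = np.asarray(residual, dtype=float)
    flag = w * residual**2 > lam           # flag entry l only if it lowers the total cost
    return np.where(flag, residual, 0.0)   # flagged entries absorb the whole residual

# Hypothetical residuals: three consistent with noise (sigma = 0.02), one gross error.
r = np.array([0.01, -0.02, 1.5, 0.03])
w = np.full(4, 1 / 0.02**2)
a_hat = outlier_update(r, w, lam=10.0)
```

A larger λ flags fewer measurements, which is the sense in which λ sets the level of trust placed in the data.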

Even with linear models, solving the optimization problem in (3.A) is NP-hard due to the l₀-norm regularization. Before proposing efficient schemes for solving the under-determined problem in (3.A), the next section will provide observability and identifiability analysis to assess the ability of R-SE to cope with sparse outlier patterns.

Outlier Observability and Identifiability

The goal of this section is to investigate fundamental uniqueness issues associated with the system under-determinacy arising due to the overcomplete outlier-aware model in (2.A). To isolate uniqueness from noise resilience issues, focus is placed on the noise-free outlier-aware measurement model written in vector form as:

$\begin{matrix} {z = h(v) + a} & (4.A) \end{matrix}$ with the high-dimensional function h(•):C^(N)→R^(M).

Definition 1

Given measurements z=h(v_(o))+a_(o), with v_(o) denoting the true state and h(•) known, the outlier vector a_(o) is observable if and only if (iff) ∀v_(o) the set

$\begin{matrix} {V := \left\{ v \in \mathbb{C}^{N} \mid z = h(v_{o}) + a_{o} = h(v) \right\}} & (5.A) \end{matrix}$ is empty. Furthermore, the outlier vector a_(o) is identifiable iff ∀v_(o) the set

$\begin{matrix} {S := \left\{ (v,a) \mid h(v) + a = z,\ \left\| a \right\|_{0} \leq \left\| a_{o} \right\|_{0} \right\}} & (6.A) \end{matrix}$ has only one element, namely (v_(o),a_(o)).

Outlier observability and identifiability are important. For an observable a_(o), upon collecting z, the system operator can discern whether there are bad data or not. In addition, for an identifiable a_(o), the system operator can recover exactly (in the absence of nominal noise) both a_(o) and v_(o) in the presence of bad data.

Definition 1 implies that if a_(o) is identifiable, then it is necessarily observable, because otherwise the set V in (5.A) would have at least one element v′∈C^(N); in which case, the pair (v′,0) would be an additional second element of S in (6.A), a fact contradicting identifiability. Therefore, as a property of an outlier vector a_(o), identifiability is stronger than (i.e., subsumes) its observability.

Without accounting for the nominal AWGN in (2.A), it is possible to reduce the cost in (3.A) to only the l₀-norm, while including the quadratic part as an equality constraint to obtain

$\begin{matrix} \begin{matrix} {\left\{ \hat{v},\hat{a} \right\} := \arg\min\limits_{h(v) + a = z} \left\| a \right\|_{0}} \\ {= \arg\min\limits_{v,\; a = z - h(v)} \left\| z - h(v) \right\|_{0}.} \end{matrix} & (7.A) \end{matrix}$

Clearly, for the noise-free R-SE problem in (7.A), the pair (v_(o),a_(o)) is feasible, and the cost evaluated at (v_(o),a_(o)) equals ∥z−h(v_(o))∥₀=∥a_(o)∥₀. This is also the minimum cost attainable when a_(o) is identifiable, as there is no other pair (v,a) with smaller l₀-norm ∥a∥₀ according to Definition 1. Conversely, if the noise-free problem (7.A) has a unique solution given by (v_(o),a_(o)), then a_(o) is identifiable. Similarly, the noise-free R-SE formulation can easily detect the presence of bad data if the minimum achievable cost is non-zero. This clearly demonstrates the role of the outlier vector's l₀-norm in the R-SE criterion (3.A), in identifying the presence of bad data, or, in recovering the true state even when bad data are present.

A critical attribute for an observable (identifiable) outlier vector is its maximum sparsity level K_(o) (respectively K_(i)). To appreciate this, consider the two broad classes that outliers typically come from. The first class includes bad data emerging due to faulty meters, telemetry errors, or software bugs, which generally occur rarely, that is, with low probability. Here, K_(o) quantifies the maximum number of bad data that can be revealed with high probability; while K_(i) denotes the maximum number of outlying meters that can be identified so that recovery of the true state becomes feasible. The second source of outliers comprises malicious data attacks, in which the adversary can typically control only a subset of meters with limited cardinality. In this class of outliers, K_(o) and K_(i) can suggest the minimum number of meters that must be protected to render malicious data attacks ineffective.

Even though K_(o) (K_(i)) is useful for assessing the degree of outlier observability (identifiability), deciding whether a given vector a_(o) is observable or identifiable for the nonlinear AC model (4.A) is challenging, except for the trivial case a_(o)=0. Fortunately, it is possible to obtain K_(o) and K_(i) leveraging the notion of the measurement distance for any nonlinear function h(•), as defined next.

Definition 2

The measurement distance for the function h(•):C^(N)→R^(M) is given by

$\begin{matrix} \begin{matrix} {D(h) := \min\limits_{v \neq v^{\prime}} \left\| h(v) - h\left( v^{\prime} \right) \right\|_{0}} \\ {= \min\limits_{v \neq v^{\prime}} \sum\limits_{l=1}^{M} \mathbf{1}\left\lbrack h_{l}(v) \neq h_{l}\left( v^{\prime} \right) \right\rbrack} \end{matrix} & (8.A) \end{matrix}$ where $\mathbf{1}\lbrack \cdot \rbrack$ denotes the indicator function.

The notion of measurement distance parallels that of the Hamming distance in channel coding theory. Given any linear mapping over a known finite field, the Hamming distance characterizes the minimum difference between any two strings that lie in the mapped space, and it can be easily computed for fixed problem dimensions. However, for the R-SE problem of interest, v is drawn from the complex field C^(N), while the mapping h(•) is quadratic. Compared to the Hamming distance it will be generally very challenging to compute D(h) in (8.A).
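Although the minimization over all pairs in (8.A) is hard, the count inside it is simple to evaluate for any single pair of states, and every such count is an upper bound on D(h). The sketch below shows that evaluation; the measurement function `h` is a made-up three-bus example and the tolerance `tol` is an assumption needed for floating-point comparison.

```python
import numpy as np

def pair_distance(h, v1, v2, tol=1e-9):
    """Number of measurements in which h(v1) and h(v2) differ (an upper bound on D(h))."""
    return int(np.count_nonzero(np.abs(h(v1) - h(v2)) > tol))

def h(v):
    # Hypothetical measurement set: |V_n|^2 at each bus, Re(V_1 V_2^*), and Re(V_2 V_3^*).
    v = np.asarray(v, dtype=complex)
    couplings = [np.real(v[0] * np.conj(v[1])), np.real(v[1] * np.conj(v[2]))]
    return np.concatenate([np.abs(v) ** 2, couplings])

v1 = np.array([1.0, 1.0, 1.0])
v2 = np.array([1.0, 1.0, 1.1])     # only the third bus differs
d = pair_distance(h, v1, v2)
```

Here the two states differ in the magnitude measurement at the third bus and in the one coupling term that involves it, so the pair distance is 2.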

Interestingly, as the Hamming distance has been popular due to its connection with the error control capability of linear channel codes, the measurement distance in (8.A) will turn out to be particularly handy in characterizing outlier observability and identifiability, as asserted in the following proposition.

Proposition 1

Given the measurement distance D for the nonlinear function h(•) in (8.A), the maximum sparsity level of an observable outlier vector is K_(o)=D−1, while the maximum one of an identifiable outlier vector is

$K_{i} = {\left\lfloor \frac{D - 1}{2} \right\rfloor.}$

The proof for both statements follows readily from Definition 1.A using simple contradiction arguments, and for this reason it is omitted. Notice that the second part can also be deduced from existing results, which neither explicitly relate to the notion of measurement distance nor are linked with the maximum sparsity level of observable outliers.

Using the measurement distance metric, Proposition 1.A provides a unifying framework to understand the tolerance of any function h(•) to the number of outlying data. Since the measurement distance of any nonlinear function is difficult to obtain, the ensuing subsection pursues linearized approximants of the quadratic measurement model, which are typically employed by Gauss-Newton iterative SE solvers, and can be used to provide surrogate distance metrics. Depending on initialization, the linear approximants could not only be very accurate, but will also shed light on understanding uniqueness issues associated with nonlinear AC power system models.

Linear Approximation Model

Consider linearizing the nonlinear measurement model (4.A) expressed in terms of the polar coordinates of the state vector. Toward this end, the N×1 complex vector v is mapped first to the 2N×1 real vector $x := \lbrack |V_{1}|,\ldots,|V_{N}|,\angle V_{1},\ldots,\angle V_{N}\rbrack \in R^{2N}$. Invoking the first-order Taylor expansion, the noise-free z can be approximated around a given point $\bar{v}$, or the corresponding $\bar{x}$, by

$\begin{matrix} {z = h(v) + a \approx h\left( \bar{v} \right) + H_{\bar{x}}\left( {x - \bar{x}} \right) + a} & \left( {9.A} \right) \end{matrix}$

where $H_{\bar{x}} \in R^{M \times (2N)}$ denotes the Jacobian matrix evaluated at $\bar{x}$. Upon defining $\tilde{z} := z - h\left( \bar{v} \right) + H_{\bar{x}}\bar{x}$, the approximate model (9.A) becomes a linear one in the unknown x, that is

$\begin{matrix} {\tilde{z} \approx H_{\bar{x}}x + a.} & \left( {10.A} \right) \end{matrix}$

The measurement distance of the linear function in (10.A) can be found easily, as summarized next.

Proposition 2.A

The measurement distance for any linear mapping characterized by a full column-rank matrix H_(x)∈R^(M×(2N)) is D=M+1−rank(H_(x)).

The proof relies on simple linear algebra arguments as follows. Using Definition 2.A, the measurement distance D:=min_(x−x′≠0)|H_(x)(x−x′)|₀ is attained when matrix H_(x) has at most (M−D) linearly dependent rows; otherwise, the number of zero entries of H_(x)(x−x′) would be (M−D+1) and that of non-zero ones (D−1), which leads to a contradiction; hence, rank (H_(x))=M−D+1, as asserted by Proposition 2.A.

Recalling from Proposition 1.A how D is linked with the outlier observability and identifiability levels, the next corollary follows readily.

Corollary 1

For any linear mapping characterized by $H_{\bar{x}}$, the maximum sparsity level of an observable outlier is $K_{o} = M - {rank}\left( H_{\bar{x}} \right)$, while the maximum sparsity level of an identifiable outlier is

$K_{i} = {\left\lfloor \frac{M - {{rank}\left( H_{\bar{x}} \right)}}{2} \right\rfloor.}$
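As a minimal sketch of how Proposition 2.A and Corollary 1 could be evaluated numerically, the following Python snippet computes the rank of a small real matrix by exact Gaussian elimination and then derives D, K_(o), and K_(i). The helper names (`matrix_rank`, `outlier_levels`) and the 5×2 matrix `H_toy` are hypothetical and chosen only for illustration; the matrix is not a power system Jacobian.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a pivot row at or below the current rank position.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def outlier_levels(H):
    """Apply D = M + 1 - rank(H), K_o = D - 1, K_i = floor((D - 1) / 2)."""
    M = len(H)
    D = M + 1 - matrix_rank(H)
    return {"D": D, "K_o": D - 1, "K_i": (D - 1) // 2}


# Hypothetical linearized model with M = 5 meters and 2 unknowns.
H_toy = [[1, 0], [0, 1], [1, 1], [1, -1], [2, 1]]
levels = outlier_levels(H_toy)
print(levels)
```

In this toy case every pair of rows is linearly independent, so any nonzero x can zero out at most one entry of the product, which agrees with the value D = 4 returned by the formula.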

For the linear approximation model in (10.A), the measurement distance D grows linearly with the number of meters M. This demonstrates that measurement redundancy is very beneficial for improving resilience to outliers. Conceivably, D could be further boosted thanks to the nonlinearity in h(•). Compared to its linear counterpart, the quadratic function h(•) is likely to increase the dimension of the space that is mapped to, and thus lead to a larger measurement distance in a space of higher dimensionality. This is precisely the reason why highly nonlinear functions find important applications to cryptography. Although linearization provides a viable approximant, quantifying (or bounding) the measurement distance for the quadratic h(•) corresponding to the AC power flow model constitutes an interesting future research direction.

Solving the R-SE via SDR

This section will leverage convex relaxation techniques to solve the R-SE problem in (3.A). First, building on the premise of compressive sampling, the l₁-norm can be employed to tackle the NP-hard l₀-norm and relax the R-SE cost in (3.A) to

$\begin{matrix} {\left\{ {\hat{v},\hat{a}} \right\} := {{\arg\;{\min_{v,a}{\sum\limits_{l = 1}^{M}{w_{l}\left\lbrack {z_{l} - {h_{l}(v)} - a_{l}} \right\rbrack}^{2}}}} + {\lambda\left\| a \right\|_{1}.}}} & \left( {11.A} \right) \end{matrix}$

One goal of the techniques described herein is to develop an R-SE solver capable of accounting for the practical AC quadratic measurement model, while attaining or approximating the global optimum at polynomial-time complexity.

This task will be pursued here using semidefinite relaxation (SDR), which is a powerful technique for convexifying the SE with nonlinear measurement models. To this end, each quadratic measurement z_(l) will be expressed linearly in terms of the outer-product matrix V:=vv^(H). Let {e_(n)}_(n=1) ^(N) denote the canonical basis of R^(N), and define the following admittance-related matrices

$\begin{matrix} {Y_{n} := e_{n}e_{n}^{T}Y} & \left( {12{a.A}} \right) \\ {Y_{mn} := \left( {{\bar{y}}_{mn} + y_{mn}} \right)e_{m}e_{m}^{T} - y_{mn}e_{m}e_{n}^{T}} & \left( {12{b.A}} \right) \end{matrix}$

and their related Hermitian counterparts

$\begin{matrix} {{H_{P,n} := {\frac{1}{2}\left( {Y_{n} + Y_{n}^{\mathcal{H}}} \right)}},{H_{Q,n} := {\frac{j}{2}\left( {Y_{n} - Y_{n}^{\mathcal{H}}} \right)}}} & \left( {13{a.A}} \right) \\ {{H_{P,mn} := {\frac{1}{2}\left( {Y_{mn} + Y_{mn}^{\mathcal{H}}} \right)}},{H_{Q,mn} := {\frac{j}{2}\left( {Y_{mn} - Y_{mn}^{\mathcal{H}}} \right)}}} & \left( {13{b.A}} \right) \\ {H_{V,n} := e_{n}e_{n}^{T}.} & \left( {13{c.A}} \right) \end{matrix}$

Using these definitions, the following lemma is proved to establish a linear model in the complex rank-one matrix V.

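Lemma 1

All error-free measurement variables are linearly related with the outer-product V as

$\begin{matrix} {{P_{n} = {{Tr}\left( {H_{P,n}V} \right)}},{Q_{n} = {{Tr}\left( {H_{Q,n}V} \right)}}} & \left( {14{a.A}} \right) \\ {{P_{mn} = {{Tr}\left( {H_{P,mn}V} \right)}},{Q_{mn} = {{Tr}\left( {H_{Q,mn}V} \right)}}} & \left( {14{b.A}} \right) \\ {\left| V_{n} \right|^{2} = {{{Tr}\left( {H_{V,n}V} \right)}.}} & \left( {14{c.A}} \right) \end{matrix}$

Thus, the measurement z_(l) in (2) can be written as

$\begin{matrix} {z_{l} = {{h_{l}(v)} + \varepsilon_{l} + a_{l}} = {{{Tr}\left( {H_{l}V} \right)} + \varepsilon_{l} + a_{l}}} & \left( {15.A} \right) \end{matrix}$

where H_(l) is a Hermitian matrix specified in accordance with (13a)-(13c).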
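The linear relation asserted by Lemma 1 can be checked numerically on a small case. The sketch below, in plain Python, builds H_(P,n) and H_(Q,n) from Y_(n) = e_(n)e_(n)^(T)Y and compares the trace expressions in (14a.A) against the injected power computed directly from the bus voltage and current. The two-bus admittance matrix, the line admittance `y`, and the voltage values are assumptions made up for this illustration, and the helper names are hypothetical.

```python
def hermitian_transpose(A):
    """Conjugate transpose of a matrix stored as nested lists."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def injection_matrices(Y, n):
    """H_{P,n} and H_{Q,n} built from Y_n = e_n e_n^T Y (row n of Y, zeros elsewhere)."""
    N = len(Y)
    Y_n = [[Y[n][j] if i == n else 0j for j in range(N)] for i in range(N)]
    Y_nH = hermitian_transpose(Y_n)
    H_P = [[0.5 * (Y_n[i][j] + Y_nH[i][j]) for j in range(N)] for i in range(N)]
    H_Q = [[0.5j * (Y_n[i][j] - Y_nH[i][j]) for j in range(N)] for i in range(N)]
    return H_P, H_Q


# Hypothetical 2-bus network: one line with series admittance y and no shunts.
y = 1.0 - 5.0j
Y = [[y, -y], [-y, y]]
v = [1.0 + 0.0j, 0.95 - 0.10j]
V = [[a * b.conjugate() for b in v] for a in v]  # outer product v v^H

n = 0
current = sum(Y[n][m] * v[m] for m in range(len(v)))  # injected current at bus n
S_direct = v[n] * current.conjugate()                 # complex power P_n + jQ_n

H_P, H_Q = injection_matrices(Y, n)
P_trace = trace(matmul(H_P, V)).real
Q_trace = trace(matmul(H_Q, V)).real
print(P_trace, S_direct.real, Q_trace, S_direct.imag)
```

The real and reactive injections obtained from the trace expressions agree with the direct computation up to floating-point rounding, which illustrates why the measurements are linear in V even though they are quadratic in v.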

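Lemma 1 implies the following equivalent reformulation of (11.A)

$\begin{matrix} {\left\{ {{\hat{V}}_{1},{\hat{a}}_{1}} \right\} := {{\arg\;{\min_{V,a}{\sum\limits_{l = 1}^{M}{w_{l}\left\lbrack {z_{l} - {{Tr}\left( {H_{l}V} \right)} - a_{l}} \right\rbrack}^{2}}}} + {\lambda\left\| a \right\|_{1}}}} & \left( {16{a.A}} \right) \\ {{{s.{to}}\mspace{14mu} V \in C^{N \times N}},{V \succcurlyeq 0},{{and}\mspace{14mu}{{rank}(V)} = 1}} & \left( {16{b.A}} \right) \end{matrix}$

where the positive semi-definiteness and rank constraints jointly ensure that for any V admissible to (16b.A), there always exists a state vector v∈C^(N) such that $V = vv^{\mathcal{H}}$.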

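Despite the linearity between z_(l) and V in the new formulation (16.A), nonconvexity is still present in two aspects: i) the cost in (16a.A) has degree 4 with respect to the entries of V; and ii) the rank constraint in (16b.A) is nonconvex. Aiming for a semidefinite programming (SDP) formulation of (16.A), Schur's complement lemma can be leveraged to convert the summands in (16a.A) to a linear cost over an auxiliary vector χ∈R^(M). Specifically, with w:=[w₁, . . . , w_(M)]^(T) and likewise for χ, consider an R-SE reformulation as:

$\begin{matrix} {\left\{ {{\hat{V}}_{2},{\hat{a}}_{2},{\hat{\chi}}_{2}} \right\} := {{\arg\;{\min_{V,a,\chi}{w^{T}\chi}}} + {\lambda\left\| a \right\|_{1}}}} & \left( {17{a.A}} \right) \\ {{{s.{to}}\mspace{14mu} V \succcurlyeq 0},{{and}\mspace{14mu}{{rank}(V)} = 1}} & \left( {17{b.A}} \right) \\ {\begin{bmatrix} {- \chi_{l}} & {z_{l} - {{Tr}\left( {H_{l}V} \right)} - a_{l}} \\ {z_{l} - {{Tr}\left( {H_{l}V} \right)} - a_{l}} & {- 1} \end{bmatrix} \preccurlyeq {0\mspace{11mu}{\forall l}}} & \left( {17{c.A}} \right) \end{matrix}$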
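To make the role of Schur's complement in (17c.A) concrete, the short derivation below shows why the 2×2 matrix inequality is the same as bounding each squared residual by χ_(l). The shorthand r_(l) for the residual is introduced here only for readability, and the final remark assumes strictly positive weights w_(l).

```latex
% Shorthand (introduced for this sketch): r_l := z_l - Tr(H_l V) - a_l.
\begin{aligned}
\begin{bmatrix} -\chi_l & r_l \\ r_l & -1 \end{bmatrix} \preccurlyeq 0
&\iff
\begin{bmatrix} \chi_l & -r_l \\ -r_l & 1 \end{bmatrix} \succcurlyeq 0 \\
&\iff \chi_l - (-r_l)\,(1)^{-1}\,(-r_l) \ge 0
\quad \text{(Schur complement of the positive entry } 1\text{)} \\
&\iff \chi_l \ge r_l^{2}.
\end{aligned}
% With w_l > 0, minimizing w^T \chi pushes each \chi_l down to r_l^2,
% so the linear cost in (17a.A) reproduces the weighted squared residuals of (16a.A).
```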

The equivalence among all three R-SE formulations can be asserted as follows.

Proposition 3.A

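For the AC power flow model, all three nonconvex formulations in (11.A), (16.A), and (17.A) solve an equivalent R-SE problem. For the optima of these problems, it holds that

$\begin{matrix} {{{\hat{V}}_{1} = {\hat{V}}_{2} = \hat{v}{\hat{v}}^{\mathcal{H}}}\mspace{14mu}{and}\mspace{14mu}{{\hat{\chi}}_{2,l} = \left\lbrack {z_{l} - {{Tr}\left( {H_{l}{\hat{V}}_{2}} \right)}} \right\rbrack^{2}}\mspace{11mu}{\forall l.}} & \left( {18.A} \right) \end{matrix}$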

Proposition 3.A establishes the relevance of the novel R-SE formulation (17.A), which is still nonconvex due to the rank-1 constraint. Fortunately, (17.A) is amenable to the SDR technique, which amounts to dropping the rank constraint and has well-appreciated merits as an optimization tool. The contribution here consists in bringing the benefits of this powerful optimization tool to estimating the state of AC power systems, even when outliers (bad data or cyber-attacks) are present.

In the spirit of SDR, relaxing the rank constraint in (17b.A) leads to the following SDP formulation:

$\begin{matrix} {\left\{ {\hat{V},\hat{a},\hat{\chi}} \right\} := {{\arg\;{\min\limits_{V,a,\chi}\;{w^{T}\chi}}} + {\lambda\left\| a \right\|_{1}}}} & \left( {19{a.A}} \right) \\ {{{s.{to}}\mspace{14mu} V} \succcurlyeq 0} & \left( {19{b.A}} \right) \\ {\begin{bmatrix} {- \chi_{l}} & {z_{l} - {{Tr}\left( {H_{l}V} \right)} - a_{l}} \\ {z_{l} - {{Tr}\left( {H_{l}V} \right)} - a_{l}} & {- 1} \end{bmatrix} \preccurlyeq {0\mspace{11mu}{\forall l}}} & \left( {19{c.A}} \right) \end{matrix}$

SDR endows R-SE with a convex SDP formulation for which efficient schemes are available to obtain the global optimum using, e.g., the interior-point solver SeDuMi [18]. The worst-case complexity of this SDP solver is O(M⁴√N log(1/ε)) for a given solution accuracy ε>0. For typical power networks, M is in the order of N, and thus the worst-case complexity becomes O(N^(4.5) log(1/ε)). Further computational complexity reduction is possible by exploiting the sparsity, and the so-called “chordal” data structure of matrix V.

Nonetheless, the SDP problem (19) is only a relaxed version of the equivalent R-SE in (17); hence, its solution $\hat{V}$ may have rank greater than 1, which makes it necessary to recover a feasible estimate $\hat{v}$ from $\hat{V}$. This is possible by eigen-decomposing

${\hat{V} = {\sum\limits_{i = 1}^{r}{\lambda_{i}u_{i}u_{i}^{\mathcal{H}}}}},$

where $r := {rank}(\hat{V})$, λ₁≧ . . . ≧λ_(r)>0 denote the positive ordered eigenvalues, and {u_(i)∈C^(N)}_(i=1) ^(r) are the corresponding eigenvectors. Since the best (in the minimum-norm sense) rank-one approximation of $\hat{V}$ is $\lambda_{1}u_{1}u_{1}^{\mathcal{H}}$, the state estimate can be chosen equal to $\hat{v}(u_{1}) := \sqrt{\lambda_{1}}u_{1}$.
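The eigenvector-based recovery can be sketched in a few lines of plain Python. The snippet below uses power iteration to approximate the leading eigenpair of a Hermitian positive semidefinite matrix and forms √λ₁u₁. Power iteration, the all-ones starting vector, and the rotation that places a reference bus at zero angle are choices made for this illustration rather than steps prescribed above, and the 2×2 test matrix is hypothetical.

```python
import math


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def leading_eigenpair(A, iters=200):
    """Power iteration; assumes A is nonzero Hermitian PSD and that the
    all-ones start is not orthogonal to the leading eigenvector."""
    x = [1.0 + 0.0j] * len(A)
    for _ in range(iters):
        y = matvec(A, x)
        scale = math.sqrt(sum(abs(c) ** 2 for c in y))
        x = [c / scale for c in y]
    # Rayleigh quotient of the unit-norm iterate.
    lam = sum(xc.conjugate() * yc for xc, yc in zip(x, matvec(A, x))).real
    return lam, x


def rank_one_state(V_hat, ref=0):
    """State estimate sqrt(lambda_1) * u_1, rotated so bus `ref` has zero angle."""
    lam, u = leading_eigenpair(V_hat)
    phase = u[ref] / abs(u[ref])
    return [math.sqrt(lam) * c / phase for c in u]


# Hypothetical relaxed solution: a rank-one part plus a small full-rank perturbation.
v_true = [1.0 + 0.0j, 0.9 - 0.2j]
V_hat = [[a * b.conjugate() + (0.01 if i == j else 0.0)
          for j, b in enumerate(v_true)] for i, a in enumerate(v_true)]
v_est = rank_one_state(V_hat)
print(v_est)
```

For larger systems a library eigen-solver would normally replace the hand-written iteration; the sketch only shows how the estimate is assembled once the leading eigenpair is available.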

Besides this eigenvector approach, randomization offers another way to extract an approximate R-SE vector from $\hat{V}$, with quantifiable approximation accuracy. The basic idea is to generate multiple Gaussian distributed random vectors $v \sim \mathcal{CN}(0,\hat{V})$, and pick the one with the minimum error cost corresponding to the set of inlier meters $\mathcal{M}_{i} := \{ l \mid 1 \leq l \leq M,\ {\hat{a}}_{l} = 0\}$. Note that although any vector v is feasible for (11.A), it is still possible to decrease the minimum achievable cost by rescaling to obtain $\hat{v}(v) = \hat{c}v$, where the optimal weight can be chosen as the solution of the following convex problem:

$\begin{matrix} \begin{matrix} {\hat{c} = {\arg\;{\min_{c > 0}{\sum\limits_{l \in \mathcal{M}_{i}}{w_{l}\left( {z_{l} - {c^{2}v^{\mathcal{H}}H_{l}v}} \right)}^{2}}}}} \\ {= \sqrt{\frac{\sum\limits_{l \in \mathcal{M}_{i}}{w_{l}z_{l}v^{\mathcal{H}}H_{l}v}}{\sum\limits_{l \in \mathcal{M}_{i}}{w_{l}\left( {v^{\mathcal{H}}H_{l}v} \right)}^{2}}}} \end{matrix} & \left( {20.A} \right) \end{matrix}$
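The closed form in (20.A) is a one-dimensional weighted least-squares fit in c². A minimal Python sketch follows; the function names, the two voltage-magnitude meters, and the candidate vector are hypothetical and serve only to exercise the formula.

```python
import math


def quad_form(H, v):
    """Real value of v^H H v for a Hermitian matrix H."""
    Hv = [sum(h * x for h, x in zip(row, v)) for row in H]
    return sum(x.conjugate() * y for x, y in zip(v, Hv)).real


def optimal_scale(v, H_list, z, w, inliers):
    """Scaling c from (20.A), using only the meters indexed by `inliers`."""
    q = {l: quad_form(H_list[l], v) for l in inliers}
    num = sum(w[l] * z[l] * q[l] for l in inliers)
    den = sum(w[l] * q[l] ** 2 for l in inliers)
    if den == 0 or num <= 0:
        # The square root in (20.A) needs a positive ratio.
        raise ValueError("closed form (20.A) is not applicable to this candidate")
    return math.sqrt(num / den)


# Hypothetical setup: two squared-voltage-magnitude meters, H_l = e_l e_l^T.
v_true = [1.0 + 0.0j, 0.9 - 0.2j]
H_list = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
z = [abs(x) ** 2 for x in v_true]       # noise-free readings
w = [1.0, 1.0]
candidate = [0.5 * x for x in v_true]   # correct direction, wrong scale
c_hat = optimal_scale(candidate, H_list, z, w, inliers=[0, 1])
print(c_hat)
```

Here the candidate is the true state shrunk by one half, so the rescaling factor that restores the fit is 2.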

It will be of interest to find approximation bounds for the SDR-based R-SE approach, or, obtain meaningful conditions under which the relaxed solution coincides with the unrelaxed one. Both problems constitute interesting future directions for analytical research, while the ensuing section will demonstrate the performance improvement possible with the proposed method using numerical tests of practical systems.

Simulation Results

The novel SDR-based R-SE approach was tested using the IEEE 30-bus system with 41 transmission lines, and compared to existing WLS methods that are based on Gauss-Newton iterations. The software toolbox MATPOWER was used to generate the pertinent power flow and meter measurements. In addition, its SE function doSE has been adapted to realize the WLS Gauss-Newton iterations. The iterations terminated either upon convergence, or, once the condition number of the approximate linearization exceeds 10⁸, which flags divergence of the iterates. To solve the SDR-based R-SE problems, the MATLAB-based optimization package CVX was used, together with the interior-point method solver SeDuMi.

The real and reactive power flows along all 41 lines were measured, together with voltage magnitudes at 30 buses. AWGN corrupts all measurements, with σ_(l) equal to 0.02 at power meters, and 0.01 at voltage meters. Except for the reference bus phasor V_(ref)=1, each bus has its voltage magnitude Gaussian distributed with mean 1 and variance 0.01, and its voltage angle uniformly distributed over [−0.5π, 0.5π].

FIGS. 7A and 7B are graphs that compare estimation errors in voltage magnitudes and angles between SDR and WLS solvers at different buses for robust semidefinite estimation. In FIGS. 7A and 7B, the empirical voltage angle and magnitude errors per bus, averaged over 500 Monte-Carlo realizations, are plotted. In each realization, one power flow meter measurement is randomly chosen as a bad datum, after multiplying the meter reading by 1.2. Clearly, the techniques described herein greatly reduce the effects of bad data on the estimation error in voltage phase angles (upper), which may be a more important SE performance metric than the magnitude one.

For the practical nonlinear AC power system model, uniqueness issues and robust state estimation (R-SE) algorithms were identified in this disclosure, when outliers (bad data and/or malicious attacks) are present. Using a sparse overcomplete outlier model, observability and identifiability issues were quantified using the notion of measurement distance for the quadratic measurement model. Valuable insights and computable levels of outlier observability and identifiability were provided herein for linear approximations of the quadratically nonlinear models. A novel SDR-based scheme was also described by tactfully relaxing the nonconvex R-SE problem to a convex SDP one, thus rendering it efficiently solvable via existing interior-point methods. Numerical simulations on the 30-bus benchmark system demonstrated improved performance of the proposed R-SE scheme.

Multi-Area State Estimation Using Distributed SDP for Nonlinear Power Systems

The electric power grid is a complex system consisting of multiple regional subsystems, each with its own transmission infrastructure spanning over a huge geographical area. Being an important power system monitoring task, state estimation (SE) has been traditionally performed at regional control centers with limited interaction. With the deregulation of energy markets however, large amounts of power are transferred among control areas over the tie-lines at high rates. This is part of the system-level institutional changes aiming at an interconnected network with improved reliability. Since each control area can be strongly affected by events and decisions elsewhere, regional operators can no longer operate in a truly independent fashion. At the same time, central processing of the current energy-related tasks faces several limitations: i) vulnerability to unreliable telemetry; ii) high computational complexity at a single control center; and iii) data security and privacy concerns of regional operators.

Reduced-complexity techniques are described in this disclosure for local control areas to solve the centralized SDP-based SE problem in a distributed fashion. That is, techniques are described in which SE is distributed among multiple control areas, possibly for multiple regional operators. The techniques described herein leverage results on positive semidefinite matrix completion to split a global state matrix constraint into local ones, which further allows for parallel implementation using the alternating-direction method of multipliers (ADMM). With minimal data exchanges among neighboring areas, each control center can efficiently perform local updates that scale with each area's size (number of buses). Numerical simulations using the IEEE 14-bus system demonstrate the asymptotic convergence of local state matrices, and desirable estimation accuracy attainable with a limited number of exchanges.

In one example, an SDR-based multi-area SE technique is described that allows the computational complexity per operator to scale with the size of its control area. Solving SDP in a distributed fashion is challenging due to the couplings of local state-related matrices enforced by the positive semidefinite (PSD) constraint of the global state matrix. The graph chordal property is exploited to decompose the global PSD constraint into ones involving local matrices (one per control center). To allow for parallel implementation, equality constraints are enforced over overlapping entries between the local matrices of any pair of neighboring areas. The alternating-direction method of multipliers is further leveraged to develop iterative local solvers attaining the centralized SDR-based SE solutions, at minimal communication overhead.

Centralized SE: an SDR Approach

FIG. 8 illustrates a transmission network 50 with N buses 14 (fourteen in the example of FIG. 8) denoted by the set of nodes $\mathcal{N} := \{ 1,\ldots,N\}$, which consists of K interconnected control areas. Suppose that this set is partitioned as $\mathcal{N} = \bigcup_{k = 1}^{K}\mathcal{N}_{k}$, where $\mathcal{N}_{k}$ contains the subset of interconnected buses 14 supervised by the k-th control center. FIG. 8, for example, illustrates buses 14 partitioned into four areas (AREA 1-AREA 4). In this example, each of the areas represents a different control area and includes a separate energy management system that applies the techniques described herein to locally compute estimates for the electrical characteristics of the electrical power buses for the respective power sub-grid within the control area.

To estimate the complex voltage V_(n) per bus n (in rectangular coordinates), and the state vector $v := \lbrack V_{1},\ldots,V_{N}\rbrack^{T} \in C^{N}$ as a whole, each area k collects synchronous measurements in the vector $z_{k} := \lbrack z_{k}^{1},\ldots,z_{k}^{M_{k}}\rbrack^{T} \in R^{M_{k}}$. These measurements include real or reactive bus injections, line flows, and bus voltage magnitudes. Adhering to the standard AC power flow model, the power flow along line (n,m) is a quadratic function of only V_(n) and V_(m) at the two end buses, while the injected power to bus n is also a quadratic function of V_(n) and voltages of all neighboring buses. In general, the l-th meter measurement at the k-th area obeys the nonlinear model:

$\begin{matrix} {{z_{k}^{l} = {{h_{k}^{l}(v)} + \varepsilon_{k}^{l}}},\mspace{14mu}{1 \leq k \leq K},\mspace{14mu}{1 \leq l \leq M_{k}}} & \left( {1.B} \right) \end{matrix}$

where h_(k) ^(l)(•) captures the quadratic dependence of z_(k) ^(l) on v, and ε_(k) ^(l) accounts for the additive measurement noise assumed independent across meters. Without loss of generality, all noise terms are pre-whitened so that each ε_(k) ^(l) has unit variance. With all the measurements from K areas available, the centralized SE under the maximum-likelihood (ML) criterion boils down to the LS one, given by:

$\begin{matrix} {{\hat{v}}_{LS} := {{\arg\;{\min_{v}{\sum\limits_{k = 1}^{K}{\sum\limits_{l = 1}^{M_{k}}\left\lbrack {z_{k}^{l} - {h_{k}^{l}(v)}} \right\rbrack^{2}}}}}.}} & \left( {2.B} \right) \end{matrix}$

The centralized LS problem in (2.B) can be solved via Gauss-Newton iterations. Such iterations linearize the quadratic measurement functions per iteration, and presume an initial guess reasonably close to the global optimum. Otherwise, state iterates may converge to a local optimum, or even diverge. However, the increasing volatility of power systems due to the integration of renewable energy sources makes it more challenging to acquire a judicious initial guess. One approach is to handle this sensitivity to initialization using a convex relaxation of the nonlinear LS cost. It relies on the fact that each function h_(k) ^(l)(•) is quadratic with respect to v, but linear in the outer-product $V := vv^{\mathcal{H}}$, which leads to re-writing (1.B) as

$\begin{matrix} {{z_{k}^{l} = {{{Tr}\left( {H_{k}^{l}V} \right)} + \varepsilon_{k}^{l}}},\mspace{14mu}{1 \leq k \leq K},\mspace{14mu}{1 \leq l \leq M_{k}}} & \left( {3.B} \right) \end{matrix}$

where H_(k) ^(l) denotes the N×N matrix that captures the linear relation between z_(k) ^(l) and V according to h_(k) ^(l)(•). This way, the centralized SE problem of solving for v amounts to an equivalent one involving its outer-product matrix V, namely

$\begin{matrix} {{\hat{V}}_{LS} := {\arg\;{\min_{V \in C^{N \times N}}{\sum\limits_{k = 1}^{K}{\sum\limits_{l = 1}^{M_{k}}\left\lbrack {z_{k}^{l} - {{Tr}\left( {H_{k}^{l}V} \right)}} \right\rbrack^{2}}}}}} & \left( {4{a.B}} \right) \\ {{{s.{to}}\mspace{14mu} V \succcurlyeq 0},{{and}\mspace{14mu}{{rank}(V)} = 1}} & \left( {4{b.B}} \right) \end{matrix}$

The PSD constraint along with the non-convex rank constraint guarantees that the obtained ${\hat{V}}_{LS}$ is a valid outer-product matrix. In fact, the equivalence of the formulations is established as ${\hat{V}}_{LS} = {\hat{v}}_{LS}{\hat{v}}_{LS}^{\mathcal{H}}$.

According to the SDR technique, the rank constraint in (4b.B) is relaxed, and an SDP is formulated using Schur's complement lemma, that is

$\begin{matrix} {\left\{ {\hat{V},\hat{\chi}} \right\} := {\arg\;{\min_{V,\chi}{\sum\limits_{k = 1}^{K}{\sum\limits_{l = 1}^{M_{k}}\chi_{k}^{l}}}}}} & \left( {5{a.B}} \right) \\ {{{{s.{to}}\mspace{14mu} V} \succcurlyeq 0},} & \left( {5{b.B}} \right) \\ {{\begin{bmatrix} {- \chi_{k}^{l}} & {z_{k}^{l} - {{Tr}\left( {H_{k}^{l}V} \right)}} \\ {z_{k}^{l} - {{Tr}\left( {H_{k}^{l}V} \right)}} & {- 1} \end{bmatrix} \preccurlyeq 0},{\forall k},{l.}} & \left( {5{c.B}} \right) \end{matrix}$

The SDP problem (5.B) allows for general-purpose convex solvers to obtain the solution in polynomial time, while having the potential to attain the global optimum of the centralized SE problem (2.B). However, the worst-case complexity of solving the SDP (5.B) is approximately O(N^(4.5) log(1/ε)) for a given solution accuracy ε>0, while the average complexity can be O(N^(3.5) log(1/ε)). Albeit polynomial, the incurred complexity here may not always scale well with the total number of buses N. Concerns regarding data privacy and integrity also motivate individual control centers to implement distributed SDP solvers with minimal computational and communication costs.

One possible approach for distributed SDP falls under the consensus-based distributed convex optimization framework. Under this framework, each control center keeps a local copy of the state matrix V in (5.B), and aims to iteratively agree with neighboring control areas. However, this consensus-based approach still requires locally solving an SDP problem of the same size as the centralized one, which is computationally burdensome for individual control centers. The ensuing section will introduce a distributed formulation for solving (5.B), enabling the local computational cost to scale with the number of buses per control center.

Distributed SDR-Based SE

To this end, let $\mathcal{N}_{(k)}$ denote the augmented subset of buses involved in all quadratic measurements contained in z_(k). Typically $\mathcal{N}_{k} \subseteq \mathcal{N}_{(k)}$, since the latter may include buses from neighboring control areas that are interconnected through tie lines. Taking for example Area 2 in FIG. 8, the subset $\mathcal{N}_{(2)}$ needs to contain bus 14₅ since there is a line flow meter 16₁ in Area 2 that is placed on the tie-line connecting buses 14₄ and 14₅. Similar reasoning assigns bus 14₉ also to $\mathcal{N}_{(2)}$. All the four sets $\{\mathcal{N}_{(k)}\}$ are indicated by the dotted lassos in FIG. 8. Based on the overlaps among $\{\mathcal{N}_{(k)}\}$, define the set of neighboring control areas for the k-th one as $\mathcal{A}_{k} := \{ j \mid 1 \leq j \leq K,\ j \neq k,\ \mathcal{N}_{(j)} \cap \mathcal{N}_{(k)} \neq \varnothing\}$. Hence, Area 2 in FIG. 8 has neighbors in $\mathcal{A}_{2} = \{ 1,3\}$. Let also the vector v_((k)) collect the state variables in $\mathcal{N}_{(k)}$. With these notations, it is possible to re-write the measurement models in (1.B) and (3.B) as

$\begin{matrix} {{z_{k}^{l} = {{h_{(k)}^{l}\left( v_{(k)} \right)} + \varepsilon_{k}^{l}} = {{{Tr}\left( {H_{(k)}^{l}V_{(k)}} \right)} + \varepsilon_{k}^{l}}},\mspace{14mu}{\forall k},l} & \left( {6.B} \right) \end{matrix}$

where $V_{(k)} := v_{(k)}v_{(k)}^{\mathcal{H}}$ denotes a submatrix of V formed by extracting rows and columns corresponding to buses in $\mathcal{N}_{(k)}$; and similarly for H_((k)) ^(l). Due to the overlaps among the subsets $\{\mathcal{N}_{(k)}\}$, the outer-product V_((k)) of the k-th area overlaps also with V_((j)), for each of its neighboring areas $j \in \mathcal{A}_{k}$. In this way, the neighboring control areas that are interconnected via transmission lines are identified and the relevant state estimates are shared during computation of the solution. Each transmission line that connects a given control area to an electrical bus within a different one of the control areas and that includes a sensor that provides a measurement associated with the transmission line is identified. In each case, intermediate results for the state estimations may be shared between the energy management systems when locally computing a solution to the state estimates. For example, as described, during the computation of the solution, the estimates for electrical bus 14₅ from control Area 1 are shared with control Area 2. Similarly, during the computation of the solution, the estimates for electrical bus 14₁₄ from control Area 3 are shared with control Area 4, the estimates for electrical bus 14₁₁ from control Area 4 are shared with control Area 3, and the estimates for electrical bus 14₉ from control Area 3 are shared with control Area 2. As such, for purposes of computation of the state estimates, the subset of the electrical power buses for which state estimates are computed by the energy management system within each of the control areas may be expanded to include electrical buses of other control areas when transmission lines connect the control areas and sensors provide measurements associated with the transmission lines.
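The neighbor relation between control areas follows mechanically from the overlaps of the augmented bus subsets. The sketch below illustrates this with Python sets. The four bus subsets are hypothetical: they are loosely modeled on the sharing pattern described above (bus 5 shared by Areas 1 and 2, bus 9 by Areas 2 and 3, buses 11 and 14 by Areas 3 and 4) and are not claimed to reproduce the exact partition of FIG. 8.

```python
def neighbor_sets(augmented):
    """Areas j != k whose augmented bus subset intersects that of area k."""
    return {k: {j for j in augmented if j != k and augmented[k] & augmented[j]}
            for k in augmented}


# Hypothetical augmented bus subsets, one per control area.
augmented = {
    1: {1, 2, 5},
    2: {3, 4, 5, 7, 8, 9},
    3: {9, 10, 11, 13, 14},
    4: {6, 11, 12, 14},
}

neighbors = neighbor_sets(augmented)
# Buses whose state entries two neighboring areas must agree on.
shared = {(k, j): augmented[k] & augmented[j]
          for k in neighbors for j in neighbors[k] if k < j}
print(neighbors)
print(shared)
```

With these sets, Area 2 has neighbors {1, 3}, and Areas 1 and 4 are not neighbors because their augmented subsets do not intersect.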

By reducing the measurement functions at the k-th area to the submatrix V_((k)), it is further possible to define the LS error cost per area k as

$\begin{matrix} {{f_{k}\left( V_{(k)} \right)}:={\sum\limits_{l = 1}^{M_{k}}\left\lbrack {z_{k}^{l} - {{Tr}\left( {H_{(k)}^{l}V_{(k)}} \right)}} \right\rbrack^{2}}} & \left( {7.B} \right) \end{matrix}$ which only involves the local matrix V_((k)). Hence, the centralized SE problem in (5.B) becomes equivalent to

$\begin{matrix} {{\hat{V} = {\arg\;{\min_{V}{\sum\limits_{k}{f_{k}\left( V_{(k)} \right)}}}}}\mspace{14mu}{{{s.{to}}\mspace{14mu} V} \succcurlyeq 0.}} & \left( {8.B} \right) \end{matrix}$

This equivalent formulation effectively expresses the overall LS cost as the superposition of local costs in (7.B). Even with such a decomposition of the cost, the main challenge to implement (8.B) in a distributed manner actually lies in the PSD constraint that couples local matrices {V_((k))} which overlap partially. If all submatrices {V_((k))} were non-overlapping, the cost would be decomposable as in (8.B), and the PSD constraint on V would simplify to $V_{(k)} \succcurlyeq 0$ per area k. This equivalence may fail to hold if submatrices partially share some entries of V. Nonetheless, these overlaps are beneficial to percolate information across control areas, and this eventually enables the local estimators to have the global performance.

The idea here is to explore valid network topologies with which such PSD constraint decomposition is feasible. To this end, it may be useful to leverage results on completing partial Hermitian matrices to obtain PSD ones. Upon obtaining the underlying graph formed by the specified entries in the partial Hermitian matrices, these results rely on the so-termed graph “chordal” property to establish the equivalence between the positive semidefiniteness of the overall matrix and that of all submatrices corresponding to the graph's maximal cliques. One example implementation of the techniques described herein for distributed SE makes use of the alternating direction method-of-multipliers (ADMM), and can be applied to handle any general error cost f_(k), so long as it is convex in V_((k)).

To decompose the PSD constraint into local ones, construct first a graph $\mathcal{G}$ over the set of buses $\mathcal{N}$ with all its edges corresponding to the entries in {V_((k))}. The graph $\mathcal{G}$ amounts to having all buses within each subset $\mathcal{N}_{(k)}$ form a clique. Furthermore, the following is assumed:

-   (as1.B) The graph with all the control areas as nodes and their     edges defined by the neighborhood subsets $\{\mathcal{A}_{k}\}_{k = 1}^{K}$ forms a tree.
-   (as2.B) Each control area has at least one bus that does not overlap     with any neighboring area.

Condition (as1.B) is quite reasonable for the control areas in most transmission networks, which in general are loosely connected over large geographical areas by a small number of tie lines. In addition, under the current meter deployment, the tie lines are not monitored everywhere, and thus it is more likely to have tree-connected control areas when requiring neighboring areas to share a tie-line with meter measurements. In FIG. 8 for instance, Areas 1 and 4 are physically interconnected by the tie line connecting bus 14 ₅-bus 14 ₆. However, there is no measurement on that line, hence the two areas are not considered neighbors and the total four areas eventually form a line array. Moreover, condition (as2.B) easily holds since in practice most of the buses are not connected to any tie line.

Proposition 1.B

Under (as1.B) and (as2.B), the graph $\mathcal{G}$ with all its edges corresponding to the entries in {V_((k))} is a chordal graph; meaning, there are no minimal cycles of length greater than 3 in $\mathcal{G}$. Furthermore, all the maximal cliques of $\mathcal{G}$ are captured by the subsets $\{\mathcal{N}_{(k)}\}$.

Proposition 1.B can be proved by contradiction, by examining possible minimal cycles of more than 3 nodes. If there exists a minimal cycle formed by nodes $\{ n_{l}\}_{l = 1}^{4}$ connected in a sequence, it is impossible to have any 3 out of the 4 nodes belong to a single subset $\mathcal{N}_{(k)}$; otherwise, these 3 nodes would form a minimal cycle by themselves. In addition, suppose the 4 nodes are from two subsets, that is, nodes $\{ n_{1},n_{2}\} \subseteq \mathcal{N}_{(j)}$, while $\{ n_{3},n_{4}\} \subseteq \mathcal{N}_{(k)}$. The edge $(n_{2},n_{3}) \in \mathcal{G}$ implies that areas k and j are neighbors, and further either n₂ or n₃ is a common bus. Let the common bus be $n_{2} \in \mathcal{N}_{(k)}$, which in turn forms a 3-node cycle with n₃ and n₄. Applying this argument by contradiction, it is possible to show the chordal property for the graph $\mathcal{G}$. For the second part of Proposition 1.B, all cliques $\{\mathcal{N}_{(k)}\}$ are maximal since condition (as2.B) excludes the case that one clique may lie within another one.
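Chordality of the graph induced by a given set of augmented bus subsets can be checked directly on small examples. The sketch below builds the graph by turning each subset into a clique and then tests chordality by repeatedly deleting a simplicial vertex (one whose neighbors are pairwise adjacent), which succeeds exactly when the graph is chordal. This elimination test is a standard graph-theoretic check chosen for the illustration; the bus subsets are hypothetical.

```python
from itertools import combinations


def graph_from_cliques(cliques):
    """Adjacency sets of the graph in which every given subset forms a clique."""
    adj = {}
    for clique in cliques:
        for u in clique:
            adj.setdefault(u, set())
        for u, w in combinations(clique, 2):
            adj[u].add(w)
            adj[w].add(u)
    return adj


def is_chordal(adj):
    """Greedy simplicial-vertex elimination; True iff the graph is chordal."""
    adj = {u: set(nb) for u, nb in adj.items()}  # work on a copy
    while adj:
        simplicial = next(
            (u for u, nb in adj.items()
             if all(b in adj[a] for a, b in combinations(nb, 2))),
            None)
        if simplicial is None:
            return False
        for nb in adj[simplicial]:
            adj[nb].discard(simplicial)
        del adj[simplicial]
    return True


# Hypothetical tree-connected areas (a chain 1-2-3-4), in line with (as1.B).
tree_areas = [{1, 2, 5}, {3, 4, 5, 7, 8, 9}, {9, 10, 11, 13, 14}, {6, 11, 12, 14}]
# Hypothetical areas arranged in a ring, which violates (as1.B).
ring_areas = [{1, 2}, {2, 3}, {3, 4}, {4, 1}]

tree_ok = is_chordal(graph_from_cliques(tree_areas))
ring_ok = is_chordal(graph_from_cliques(ring_areas))
print(tree_ok, ring_ok)
```

The ring example produces a chordless 4-cycle, which illustrates why the tree assumption on the control areas matters for the decomposition.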

Given any entry of {V_((k))}, the full PSD matrix V is “completable” if and only if $\mathcal{G}$ is chordal, and all its submatrices corresponding to the maximal cliques are PSD. Hence, satisfying the PSD constraint $V \succcurlyeq 0$ is tantamount to enforcing the constraint on all local matrices, $V_{(k)} \succcurlyeq 0$, ∀k. With this equivalence, the centralized SDR-based SE problem (8.B) can be re-written as

$\begin{matrix} {{\hat{V} = {\arg\;{\min_{V}{\sum\limits_{k}{f_{k}\left( V_{(k)} \right)}}}}}\mspace{14mu}{{{{s.{to}}\mspace{14mu} V_{(k)}} \succcurlyeq 0},{\forall{k.}}}} & \left( {9.B} \right) \end{matrix}$

The constraint decomposition in (9.B) is key for developing distributed solvers. To this end, each control area k solves for its own local V_((k)), denoted by the complex matrix W_(k) of size $|\mathcal{N}_{(k)}| \times |\mathcal{N}_{(k)}|$. For every pair of neighboring areas, say k and j, identify the intersection of their buses as S_(kj). Let also W_(k) ^(j) denote the submatrix extracted from W_(k) with both rows and columns corresponding to S_(kj); and likewise for the submatrix W_(j) ^(k) from W_(j). To formulate the centralized SE problem (9.B) as one involving all local matrices {W_(k)}, it suffices to have additional equality constraints on the overlapping entries, namely

$\begin{matrix} {{\left\{ {\hat{W}}_{k} \right\} = {\arg\;{\min_{\{ W_{k}\}}{\sum\limits_{k}{f_{k}\left( W_{k} \right)}}}}}\mspace{14mu}{{{{s.{to}}\mspace{14mu} W_{k}} \succcurlyeq 0},{\forall k},\mspace{14mu}{W_{k}^{j} = W_{j}^{k}},{\forall{j \in \mathcal{A}_{k}}},{\forall{k.}}}} & \left( {10.B} \right) \end{matrix}$
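The overlapping blocks W_(k) ^(j) and W_(j) ^(k) in (10.B) are obtained by indexing each local matrix with the positions of the shared buses. A minimal Python sketch follows; the bus lists, voltage values, and helper names are hypothetical, and the local matrices are formed from one common state vector so that the equality constraint holds by construction.

```python
def local_outer(v_global, buses):
    """Local matrix v_(k) v_(k)^H for the buses listed (in local order)."""
    return [[v_global[a] * v_global[b].conjugate() for b in buses] for a in buses]


def shared_block(W_k, buses_k, buses_j):
    """Submatrix of W_k with rows and columns at the buses shared with area j."""
    shared = sorted(set(buses_k) & set(buses_j))
    idx = [buses_k.index(b) for b in shared]
    return [[W_k[r][c] for c in idx] for r in idx]


def consensus_gap(A, B):
    """Largest entry-wise disagreement between two equally sized blocks."""
    return max(abs(a - b) for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


# Hypothetical global state indexed by bus number.
v_global = {1: 1.00 + 0.00j, 2: 0.98 - 0.03j, 3: 0.97 - 0.05j,
            4: 0.99 + 0.02j, 5: 0.96 - 0.08j}
buses_k = [1, 2, 3, 4]   # augmented subset of area k (local ordering)
buses_j = [4, 5, 3]      # augmented subset of area j (different local ordering)

W_k = local_outer(v_global, buses_k)
W_j = local_outer(v_global, buses_j)
W_kj = shared_block(W_k, buses_k, buses_j)  # block of W_k on shared buses {3, 4}
W_jk = shared_block(W_j, buses_j, buses_k)  # block of W_j on the same buses
gap = consensus_gap(W_kj, W_jk)
print(gap)
```

Sorting the shared buses gives both areas the same ordering of rows and columns, so the two blocks can be compared entry by entry even when the local orderings differ.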

The equality constraints of (10.B) force neighboring areas to agree on their shared entries, which renders (10.B) equivalent to (9.B), with the equivalence established as ${\hat{W}}_{k} = {\hat{V}}_{(k)}$, ∀k. Interestingly, this overcomes couplings of the PSD constraint, and will allow for powerful distributed implementation modules to realize multi-area SDR-based SE, as elaborated in the ensuing section.

Alternating Direction Method-of-Multipliers

Clearly, the equality constraints in (10.B) couple the optimization problems across the K areas. To enable a fully distributed solution, two auxiliary matrices denoted by R_(kj) and I_(kj) are introduced per pair of neighboring areas k, j. For notational brevity, symbols R_(kj) and R_(jk) are used interchangeably for the same matrix; and similarly for I_(kj) and I_(jk). Then, (10.B) can be alternatively expressed as

$\begin{matrix} {\left\{ \hat{W}_{k} \right\} = \arg\min_{\{ W_{k} \succcurlyeq 0\}} \sum\limits_{k} f_{k}\left( W_{k} \right) \quad \text{s. to } \mathcal{R}\left( W_{k}^{j} \right) = R_{kj},\ \forall j \in \mathcal{N}_{k},\ \forall k, \quad \mathcal{I}\left( W_{k}^{j} \right) = I_{kj},\ \forall j \in \mathcal{N}_{k},\ \forall k.} & \left( {11.B} \right) \end{matrix}$

The goal is to solve (11.B) in a distributed fashion using the alternating direction method of multipliers (ADMM), a method that has been successfully applied to various distributed optimization problems. Let Γ_(kj) and Λ_(kj) denote the Lagrange multipliers associated with the pair of constraints in (11.B). With c>0 denoting a penalty coefficient, consider the augmented Lagrangian function of (11.B) as

$\begin{matrix} {\mathcal{L}\left( \{W_{k}\}, \{R_{kj}\}, \{I_{kj}\}, \{\Gamma_{kj}\}, \{\Lambda_{kj}\} \right) := \sum\limits_{k} \left\{ f_{k}\left( W_{k} \right) + \sum\limits_{j \in \mathcal{N}_{k}} \mathrm{Tr}\left\lbrack \Gamma_{kj}\left( \mathcal{R}\left( W_{k}^{j} \right) - R_{kj} \right) \right\rbrack + \sum\limits_{j \in \mathcal{N}_{k}} \frac{c}{2}\left\| \mathcal{R}\left( W_{k}^{j} \right) - R_{kj} \right\|_{F}^{2} + \sum\limits_{j \in \mathcal{N}_{k}} \mathrm{Tr}\left\lbrack \Lambda_{kj}\left( \mathcal{I}\left( W_{k}^{j} \right) - I_{kj} \right) \right\rbrack + \sum\limits_{j \in \mathcal{N}_{k}} \frac{c}{2}\left\| \mathcal{I}\left( W_{k}^{j} \right) - I_{kj} \right\|_{F}^{2} \right\}.} & \left( {12.B} \right) \end{matrix}$

Letting i denote the iteration index in the superscript, the ADMM operates by minimizing the Lagrangian L in (12.B) cyclically with respect to one set of variables while fixing the rest. Given all the iterates at the i-th iteration, the ADMM steps proceed to the ensuing iteration as follows.

-   [S1] Update the primal variables: $\{W_{k}^{i+1}\} := \arg\min_{\{W_{k} \succcurlyeq 0\}} \mathcal{L}\left( \{W_{k}\}, \{R_{kj}^{i}\}, \{I_{kj}^{i}\}, \{\Gamma_{kj}^{i}\}, \{\Lambda_{kj}^{i}\} \right)$
-   [S2] Update the auxiliary variables: $\{R_{kj}^{i+1}, I_{kj}^{i+1}\} := \arg\min_{\{R_{kj}, I_{kj}\}} \mathcal{L}\left( \{W_{k}^{i+1}\}, \{R_{kj}\}, \{I_{kj}\}, \{\Gamma_{kj}^{i}\}, \{\Lambda_{kj}^{i}\} \right)$
-   [S3] Update the multipliers: $\Gamma_{kj}^{i+1} := \Gamma_{kj}^{i} + c\left\lbrack \mathcal{R}\left( W_{k}^{j} \right)^{i+1} - R_{kj}^{i+1} \right\rbrack$ and $\Lambda_{kj}^{i+1} := \Lambda_{kj}^{i} + c\left\lbrack \mathcal{I}\left( W_{k}^{j} \right)^{i+1} - I_{kj}^{i+1} \right\rbrack$.

All the variables can be easily initialized to 0. Clearly, the optimization problem in [S1] is decomposable over all K control areas. Moreover, by exploiting the problem structure, the three steps can be simplified as follows.

-   [S1] Update W_(k) ^(i+1) per area k:

$\begin{matrix} {W_{k}^{i+1} := \arg\min_{W_{k} \succcurlyeq 0} \left\{ f_{k}\left( W_{k} \right) + \sum\limits_{j \in \mathcal{N}_{k}} \mathrm{Tr}\left\lbrack \Gamma_{kj}^{i}\mathcal{R}\left( W_{k}^{j} \right) \right\rbrack + \sum\limits_{j \in \mathcal{N}_{k}} \mathrm{Tr}\left\lbrack \Lambda_{kj}^{i}\mathcal{I}\left( W_{k}^{j} \right) \right\rbrack + \sum\limits_{j \in \mathcal{N}_{k}} \frac{c}{2}\left\| \mathcal{R}\left( W_{k}^{j} \right) - R_{kj}^{i} \right\|_{F}^{2} + \sum\limits_{j \in \mathcal{N}_{k}} \frac{c}{2}\left\| \mathcal{I}\left( W_{k}^{j} \right) - I_{kj}^{i} \right\|_{F}^{2} \right\}.} & \left( {13.B} \right) \end{matrix}$

-   [S2] Update the pair of auxiliary variables per area k:

$\begin{matrix} {R_{kj}^{i+1} := \frac{1}{2}\left\lbrack \mathcal{R}\left( W_{k}^{j} \right)^{i+1} + \mathcal{R}\left( W_{j}^{k} \right)^{i+1} \right\rbrack} & \left( {14.B} \right) \\ {I_{kj}^{i+1} := \frac{1}{2}\left\lbrack \mathcal{I}\left( W_{k}^{j} \right)^{i+1} + \mathcal{I}\left( W_{j}^{k} \right)^{i+1} \right\rbrack.} & \left( {15.B} \right) \end{matrix}$

-   [S3] Update the pair of multipliers per area k:

$\begin{matrix} {\Gamma_{kj}^{i+1} := \Gamma_{kj}^{i} + \frac{c}{2}\left\lbrack \mathcal{R}\left( W_{k}^{j} \right)^{i+1} - \mathcal{R}\left( W_{j}^{k} \right)^{i+1} \right\rbrack} & \left( {16.B} \right) \\ {\Lambda_{kj}^{i+1} := \Lambda_{kj}^{i} + \frac{c}{2}\left\lbrack \mathcal{I}\left( W_{k}^{j} \right)^{i+1} - \mathcal{I}\left( W_{j}^{k} \right)^{i+1} \right\rbrack.} & \left( {17.B} \right) \end{matrix}$

Since the LS error f_(k) is quadratic, the cost in (13.B) is convex in W_(k), and can be formulated as an SDP using Schur's complement lemma. Hence, [S1] amounts to solving local problems that scale with the number of buses controlled by each regional center, which greatly reduces the computational burden as compared to the global SDP problem. In addition, both [S2] and [S3] involve linear iterations and are very efficient. This completes the iterations for the distributed SE solver among multiple areas. A couple of remarks are now in order.
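The closed-form steps [S2] and [S3] can be sketched for a single pair of neighboring areas as follows. This is a minimal NumPy illustration rather than a full solver: the function name `admm_pair_update`, the 2×2 example submatrices, and the value c = 2 are assumptions, and the local SDP of [S1] is not solved here, so the submatrices are simply given.

```python
import numpy as np

def admm_pair_update(W_kj, W_jk, Gamma_kj, Lambda_kj, c):
    """One pass of [S2]-[S3] for the pair (k, j), from the viewpoint of area k.

    W_kj is the shared submatrix W_k^j held by area k and W_jk is W_j^k
    received from area j. Both are complex Hermitian.
    """
    # [S2], (14.B)-(15.B): average the real and imaginary parts.
    R_kj = 0.5 * (W_kj.real + W_jk.real)
    I_kj = 0.5 * (W_kj.imag + W_jk.imag)
    # [S3], (16.B)-(17.B): move the multipliers along the disagreement.
    Gamma_new = Gamma_kj + 0.5 * c * (W_kj.real - W_jk.real)
    Lambda_new = Lambda_kj + 0.5 * c * (W_kj.imag - W_jk.imag)
    return R_kj, I_kj, Gamma_new, Lambda_new

# Hypothetical shared submatrices after some [S1] step.
W_kj = np.array([[1.0, 0.5 + 0.2j], [0.5 - 0.2j, 1.2]])
W_jk = np.array([[1.1, 0.4 + 0.1j], [0.4 - 0.1j, 1.0]])
zeros = np.zeros((2, 2))
c = 2.0  # assumed penalty coefficient

# Area k and area j each run the same update with the roles swapped.
R_k, I_k, Gamma_k, Lambda_k = admm_pair_update(W_kj, W_jk, zeros, zeros, c)
R_j, I_j, Gamma_j, Lambda_j = admm_pair_update(W_jk, W_kj, zeros, zeros, c)
print(R_k, Gamma_k, sep="\n")
```

Both areas arrive at the same auxiliary matrices, and with zero initialization their multipliers stay equal in magnitude and opposite in sign, which is why the plain average in (14.B) and (15.B) carries no multiplier term.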

Remark 1.B (Estimation Error Criteria)

Compared to the distributed OPF methods with linear or at most quadratic costs, the proposed distributed SE framework can accommodate more general error cost functions f_(k) for various estimation purposes. For robust estimation purposes, the l₁-norm of estimation errors can be used. A combination of the l₁ and l₂ norms may be used for improved accuracy in joint estimation of the state and extraction of the outliers. Both error criteria will lead to a reformulation of (13.B) in [S1] as a local convex SDP problem.
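The error criteria mentioned in this remark can be compared on a vector of measurement residuals. The sketch below is illustrative only: the residual values and function names are hypothetical, and the combined criterion is assumed to take the common form of a squared error plus an l₁ penalty on an explicit outlier vector o, weighted by an assumed parameter `lam`. Under that assumption, the outlier minimizing (r − o)² + lam·|o| for a fixed residual r is obtained by soft-thresholding r at lam/2.

```python
import numpy as np

def l2_cost(r):
    """Least-squares criterion: sum of squared residuals."""
    return float(np.sum(r ** 2))

def l1_cost(r):
    """l1 criterion: sum of absolute residuals, less sensitive to outliers."""
    return float(np.sum(np.abs(r)))

def extract_outliers(r, lam):
    """Elementwise minimizer over o of (r - o)^2 + lam * |o| (soft threshold)."""
    return np.sign(r) * np.maximum(np.abs(r) - lam / 2.0, 0.0)

# Hypothetical residuals z_l - h_l(v); the last entry mimics a bad measurement.
residuals = np.array([0.1, -0.05, 5.0])
outliers = extract_outliers(residuals, lam=1.0)

print(l2_cost(residuals), l1_cost(residuals), outliers)
```

The single bad measurement dominates the squared-error cost, contributes far less to the l₁ cost, and is the only entry flagged by the outlier extraction.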

Remark 2.B (Data Exchange Overhead and Privacy)

At first glance, one may think that in steps [S1]-[S3] per iteration i neighboring control centers need to exchange the submatrices {(W_(k) ^(j))^(i)} related to the common buses, as well as the associated multipliers {Γ_(kj) ^(i)} and {Λ_(kj) ^(i)}. A closer look, however, reveals that exchanging the common submatrices suffices, because each area can update the multipliers locally from (16.B) and (17.B) once all areas agree on a common initialization, e.g., by initializing all multipliers to zero. This suggests a considerable reduction in the communication overhead for the ADMM iterations. Furthermore, the proposed scheme requires exchanging neither local measurements nor local network topology. It suffices to share only a small portion of the local state matrices. From the data privacy perspective, individual operators enjoy this benefit as a natural bonus of the novel multi-area SE method.

Upon convergence, each control center obtains the iterate W_(k) ^(i) as the estimate of its local matrix V_((k)) of the centralized SDR-based SE problem (9.B). Nonetheless, this SDP problem is only a relaxed version of the original SE in (4.B); hence, its solution may have rank greater than 1, which requires recovering a feasible local state estimate $\hat{v}_{(k)}^{i}$ from W_(k) ^(i). This is possible by eigen-decomposing

$W_{k}^{i} = \sum\limits_{r = 1}^{R} \lambda_{r} u_{r} u_{r}^{\mathcal{H}}$ per area k, where R:=rank(W_(k) ^(i)), λ₁ ≥ . . . ≥ λ_(R) > 0 denote the positive ordered eigenvalues, and $\{u_{r} \in \mathbb{C}^{|\mathcal{N}_{(k)}|}\}_{r=1}^{R}$ are the corresponding eigenvectors. Since the best (in the minimum-norm sense) rank-one approximation of W_(k) ^(i) is λ₁u₁u₁ ^(H), the state estimate can be chosen equal to $\hat{v}_{(k)}^{i} := \sqrt{\lambda_{1}}\, u_{1}$. Besides this eigenvector approach, randomization offers another way to recover the vector estimate, with the potential to achieve quantifiable approximation accuracy. The basic idea is to generate multiple Gaussian distributed random vectors $v \sim \mathcal{CN}(0, W_{k}^{i})$, and pick the one with the minimum error cost.

Numerical Simulations

The performance of the proposed distributed multi-area SE method was tested using the IEEE 14-bus system with the area partition depicted in FIG. 8. All four areas measure their local bus voltage magnitudes, as well as real and reactive power flow levels at the lines marked by the squares, which leads to the augmented local states as indicated by the dotted lassos. There are three overlapping areas, which correspond to the equality constraints enforced in (11.B). The interior-point method based convex solver SeDuMi is used to solve the centralized SDR-based SE problem, as well as the iterative local matrix update (13.B) in [S1].

To illustrate convergence of the ADMM iterations to the centralized SE solution $\hat{V}$, the local matrix error $\| W_{k}^{i} - \hat{V}_{(k)} \|_{F}$ is plotted versus the iteration index i in FIG. 9A for every control area k. Clearly, all the local iterates converge (with a linear rate) to their counterparts in the centralized solution. In addition, as the estimation task is of interest here, the local estimation error $\| \hat{v}_{k}^{i} - v_{k} \|_{2}$ is also plotted in FIG. 9B, where $\hat{v}_{k}^{i}$ is the estimate of bus voltages at $\mathcal{N}_{k}$ obtained from the iterate W_(k) ^(i) using the eigen-decomposition method. Interestingly, the estimation error curves converge within estimation accuracy of around 10⁻² after about 40 iterations (fewer than 20 iterations for Areas 1 and 2), even though the local matrix has not yet converged. This demonstrates that in practice it is possible to achieve the centralized estimation accuracy with a limited number of iterations, which in turn leads to affordable inter-area communication overhead.
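The two ways of recovering a state vector from a local matrix iterate, the eigen-decomposition method and randomization, can be sketched in NumPy as follows. The function names, the 3-bus example, the sample count, and the error cost based on squared voltage magnitudes are all hypothetical; a practical cost would be the local estimation criterion f_(k).

```python
import numpy as np

def recover_by_eigenvector(W):
    """Scale the principal eigenvector of Hermitian W by sqrt(lambda_1)."""
    lam, U = np.linalg.eigh(W)  # eigenvalues are returned in ascending order
    return np.sqrt(max(lam[-1], 0.0)) * U[:, -1]

def recover_by_randomization(W, cost, num_samples=100, seed=0):
    """Draw v ~ CN(0, W) repeatedly and keep the sample with the lowest cost."""
    rng = np.random.default_rng(seed)
    lam, U = np.linalg.eigh(W)
    root = U * np.sqrt(np.clip(lam, 0.0, None))  # root @ root^H equals W
    n = W.shape[0]
    best, best_cost = None, np.inf
    for _ in range(num_samples):
        # Unit-variance circular complex Gaussian vector.
        g = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
        v = root @ g
        value = cost(v)
        if value < best_cost:
            best, best_cost = v, value
    return best, best_cost

# Hypothetical rank-one local matrix built from a known 3-bus state.
v_true = np.array([1.0 + 0.0j, 0.98 - 0.05j, 1.02 + 0.03j])
W = np.outer(v_true, v_true.conj())

# Assumed error cost: mismatch of squared voltage magnitudes.
z = np.abs(v_true) ** 2
cost = lambda v: float(np.sum((z - np.abs(v) ** 2) ** 2))

v_eig = recover_by_eigenvector(W)
v_rand, rand_cost = recover_by_randomization(W, cost)
print(cost(v_eig), rand_cost)
```

The eigenvector estimate matches the true state only up to a common phase rotation, which is why the check compares outer products or magnitudes rather than the vectors themselves.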

Distributed SE based on SCADA measurements was described for nonlinear AC power systems. To overcome the non-convexity of casting the centralized SE as a nonlinear LS problem along with the convergence challenges of the companion gradient solvers, an SDR-based formulation was leveraged here involving the outer-product matrix of the wanted state. The described multi-area SE approach offers not only computational reduction, but also the potential of achieving global optimality. With minimal data exchanges between neighboring areas, each control center updates its local state matrix with complexity that scales with the number of buses per region, and converges to the centralized solution. The proposed techniques can be readily modified to cope also with alternative convex criteria, which is useful if robustness is at a premium.

FIG. 10 is a flowchart illustrating example operation of a control device, such as energy management system 12 of FIG. 1 or an energy management system for any of the control areas of FIG. 8. In this example, the energy management system receives measurements of an electrical characteristic from a plurality of measurement units (MUs) positioned at a subset of the power buses within the power grid (100). As described above, in some cases the energy management system may also receive intermediate results for the estimates from other energy management systems (104).

The energy management system applies semidefinite relaxation to compute (e.g., update) a semidefinite programming model for the power buses within the power grid based on the measurements and any intermediate results from other systems, where the model linearly relates the estimates for the electrical characteristic to the measurements of the electrical characteristic (102).

The energy management system may iteratively update the model until determining that a sufficient solution has been reached. This may be determined, for example, after a predefined number of iterations has been applied or when convergence is detected (106).
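The stopping rule of step (106) can be sketched as a generic iteration driver. The function name, the tolerance, the iteration cap, and the scalar toy update are assumptions made for illustration; in the multi-area setting the update would be one ADMM pass and the change would be measured with a matrix norm.

```python
def iterate_until_done(step, x0, max_iters=200, tol=1e-8):
    """Apply `step` until the change drops below `tol` or `max_iters` is hit.

    Returns the last iterate, the number of iterations used, and a flag
    telling whether convergence was detected before the cap.
    """
    x = x0
    for it in range(1, max_iters + 1):
        x_new = step(x)
        if abs(x_new - x) < tol:
            return x_new, it, True
        x = x_new
    return x, max_iters, False

# Toy contraction with fixed point 2.0, standing in for one model update.
update = lambda x: 0.5 * x + 1.0

estimate, iters, converged = iterate_until_done(update, 0.0)
_, _, converged_early = iterate_until_done(update, 0.0, max_iters=5)
print(estimate, iters, converged, converged_early)
```

Either exit condition ends the loop, so a caller can tell a converged solution from one that merely exhausted its iteration budget.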

Upon computing the current solution for the state estimates for system variables for its entire respective control area, the energy management system may perform a programmed action, such as generating one or more alerts, generating reports, or updating accounting or billing systems (108).

FIG. 11 shows a detailed example of various devices that may be configured to execute program code to practice some embodiments in accordance with the current disclosure. For example, an energy management system configured to operate in accordance with one or more of the techniques described herein may be implemented as one or more of the computers 500 of FIG. 11.

In this example, computer 500 includes a processor 510 that is operable to execute program instructions or software, causing the computer to perform various methods or tasks. Processor 510 is coupled via bus 520 to a memory 530, which is used to store information such as program instructions and other data while the computer is in operation. A storage device 540, such as a hard disk drive, nonvolatile memory, or other non-transient storage device, stores information such as program instructions, data files of the multidimensional data and the reduced data set, and other information. The computer also includes various input-output elements 550, including parallel or serial ports, USB, Firewire or IEEE 1394, Ethernet, and other such ports to connect the computer to external devices such as a printer, video camera, surveillance equipment or the like. Other input-output elements include wireless communication interfaces such as Bluetooth, Wi-Fi, and cellular data networks.

The computer itself may be a traditional personal computer, a rack-mount or business computer or server as shown in FIG. 11, or any other type of computerized system such as a power grid management system. The computer in a further example may include fewer than all elements listed above, such as a thin client or mobile device having only some of the shown elements. In another example, the computer is distributed among multiple computer systems, such as a distributed server that has many computers working together to provide various functions.

The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof. Various features described as modules, units or components may be implemented together in an integrated logic device or separately as discrete but interoperable logic devices or other hardware devices. In some cases, various features of electronic circuitry may be implemented as one or more integrated circuit devices, such as an integrated circuit chip or chipset.

If implemented in hardware, this disclosure may be directed to an apparatus such as a processor or an integrated circuit device, such as an integrated circuit chip or chipset. Alternatively or additionally, if implemented in software or firmware, the techniques may be realized at least in part by a computer readable data storage medium comprising instructions that, when executed, cause one or more processors to perform one or more of the methods described above. For example, the computer-readable data storage medium may store such instructions for execution by a processor. Any combination of one or more computer-readable medium(s) may be utilized.

A computer-readable storage medium may form part of a computer program product, which may include packaging materials. A computer-readable storage medium may comprise a computer data storage medium such as random access memory (RAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), flash memory, magnetic or optical data storage media, and the like. In general, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. Additional examples of computer readable medium include computer-readable storage devices, computer-readable memory, and tangible computer-readable medium. In some examples, an article of manufacture may comprise one or more computer-readable storage media.

In general, the computer-readable storage media represents non-transitory media. The term “non-transitory” may indicate that the storage medium is not embodied in a carrier wave or a propagated signal. In certain examples, a non-transitory storage medium may store data that can, over time, change (e.g., in RAM or cache).

The code or instructions may be software and/or firmware executed by processing circuitry including one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term “processor,” as used herein may refer to any of the foregoing structure or any other processing circuitry suitable for implementation of the techniques described herein. In addition, in some aspects, functionality described in this disclosure may be provided within software modules or hardware modules.

Although specific embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that any arrangement that achieves the same purpose, structure, or function may be substituted for the specific embodiments shown. This application is intended to cover any adaptations or variations of the embodiments described herein. It is intended that this disclosure be limited only by the claims, and the full scope of equivalents thereof. 

The invention claimed is:
 1. A method for estimating a state for each of a plurality of alternating current (AC) electrical power buses within a power grid, the method comprising: receiving, with an energy management system for the power grid, measurements of an electrical characteristic from a plurality of measurement units (MUs) positioned at a subset of the power buses within the power grid; processing the measurements of the electrical characteristic with the energy management system to compute an estimate of the electrical characteristic currently occurring at each of the power buses within the power grid, the estimates of the electrical characteristic being nonlinearly related to the measurements of the electrical characteristic, wherein processing the measurements of the electrical characteristic comprises: applying semidefinite relaxation to compute a semidefinite programming model for the power buses within the power grid that linearly relates the estimates for the electrical characteristic to the measurements of the electrical characteristic; and iteratively processing the measurements of the electrical characteristic in accordance with the semidefinite programming model to compute a convex minimization as a solution for the estimates of the electrical characteristic; and controlling operation of one or more of the power buses based on the computed estimates.
 2. The method of claim 1, wherein the measurements of the electrical characteristic comprises measurements of one or more of a current phasor, a voltage phasor or a power phasor.
 3. The method of claim 1, wherein the measurements of the electrical characteristic comprises a complex voltage at each of the power buses including: (1) a real power injection into the power grid at the power bus, (2) a real power flow to the power grid at the power bus, and (3) a voltage magnitude at the power bus.
 4. The method of claim 1, wherein the measurements of the electrical characteristic are received from phasor measurement units (PMUs).
 5. The method of claim 1, wherein applying semidefinite relaxation to compute a semidefinite programming model for the power buses comprises defining the set of estimates of one or more voltage phasors, {circumflex over (v)}, such that $\hat{v}:={\arg\;{\min\limits_{v}{\sum\limits_{l = 1}^{L}{{w_{l}\left\lbrack {z_{l} - {h_{l}(v)}} \right\rbrack}^{2}.}}}}$
 6. The method of claim 1, further comprising generating a report that specifies power consumption at each of the power buses within the power grid based on the computed estimates.
 7. The method of claim 6, further comprising updating an accounting system to charge respective operators of each of the power buses based on the computed estimates.
 8. The method of claim 1, further comprising generating an alert based on the computed estimates.
 9. The method of claim 1, wherein the semidefinite programming model includes a state vector indicating a trustworthiness for each of the measurements, and wherein processing the measurements comprises computing the solution by eliminating any outliers of meter measurements based on the trustworthiness of each of the measurements.
 10. The method of claim 9, wherein the state vector identifies outliers including faulty measurements, bad data due to communication error and data that has been compromised due to attack.
 11. The method of claim 9, wherein the semidefinite programming model includes a configurable parameter to control a required level of trustworthiness for the measurements.
 12. The method of claim 1, wherein the energy management system is one of a plurality of energy management systems associated with respective power grids, each of the energy management systems computing estimates of the electrical characteristic for electrical power buses within the respective power grid, the method further comprising: communicating the estimates from each of the energy management systems to a central energy management system; and iteratively processing the estimates from the energy management systems in accordance with a semidefinite programming model to compute a global solution for the estimates of the electrical characteristics for the entire plurality of power grids.
 13. The method of claim 1, wherein the energy management system is one of a plurality of energy management systems associated with respective power sub-grids that are interconnected to form the power grid, each of the power sub-grids including a subset of the electrical busses and representing a control area for the corresponding energy management system, and wherein processing the measurements comprises, with each of the energy management systems, locally computing the state estimates of the electrical characteristic for the subset of the electrical power buses within the respective control area associated with the respective power sub-grid.
 14. The method of claim 13, further comprising: expanding the subset of the electrical power buses for a first one of the control areas to include one of the electrical busses of a second one of the control areas when a transmission line connects the first one of the control areas to the electrical bus of the second control area and the first one of the control areas includes a sensor that provides a measurement associated with the transmission line; and communicating, during the computation of the solution for the state estimates, intermediate results for the state estimates from the second one of the control areas to the first one of the control areas for the electrical bus to which the transmission line is connected.
 15. The method of claim 13, further comprising, for each of the control areas, identifying each transmission line that connects the control area to an electrical bus within a different one of the control areas and includes a sensor that provides a measurement associated with the transmission line; and for each of the identified transmission lines, communicating, during the computation of the solution, intermediate results for the estimates from the second one of the control areas to the first one of the control areas for the electrical bus to which the transmission line is connected.
 16. A device comprising: memory to store measurements of an electrical characteristic from a plurality of measurement units (MUs) positioned at a subset of alternating current (AC) electrical power buses within a power grid; and one or more processors configured to execute program code to process the measurements of the electrical characteristic to compute an estimate of the electrical characteristic currently occurring at each of the power buses within the power grid, the estimates of the electrical characteristic being nonlinearly related to the measurements of the electrical characteristic, wherein the program code is configured to apply semidefinite relaxation to compute a semidefinite programming model for the power buses within the power grid that linearly relates the estimates for the electrical characteristic to the measurements of the electrical characteristic, and iteratively process the measurements of the electrical characteristic in accordance with the semidefinite programming model to compute a convex minimization as a solution for the estimates of the electrical characteristic, and wherein the program code is configured to control operation of one or more of the power buses based on the computed estimates.
 17. The device of claim 16, wherein the measurements of the electrical characteristic comprises measurements of one or more of a current phasor, a voltage phasor or a power phasor.
 18. The device of claim 16, wherein the measurements of the electrical characteristic comprises a complex voltage at each of the power buses including: (1) a real power injection into the power grid at the power bus, (2) a real power flow to the power grid at the power bus, and (3) a voltage magnitude at the power bus.
 19. The device of claim 16, wherein the program code is configured to apply semidefinite relaxation to compute a semidefinite programming model for the power buses comprising defining the set of estimates of one or more voltage phasors, {circumflex over (v)}, such that $\hat{v}:={\arg\;{\min\limits_{v}{\sum\limits_{l = 1}^{L}{{w_{l}\left\lbrack {z_{l} - {h_{l}(v)}} \right\rbrack}^{2}.}}}}$
 20. The device of claim 16, wherein the program code is configured to generate a report that specifies power consumption at each of the power buses within the power grid based on the computed estimates.
 21. The device of claim 16, wherein the program code is configured to update an accounting system to charge respective operators of each of the power buses based on the computed estimates.
 22. The device of claim 16, wherein the program code is configured to generate an alert based on the computed estimates.
 23. The device of claim 16, wherein the semidefinite programming model includes a state vector indicating a trustworthiness for each of the measurements, and wherein processing the measurements comprises computing the solution by eliminating any outliers of meter measurements based on the trustworthiness of each of the measurements.
 24. The device of claim 23, wherein the state vector identifies outliers including faulty measurements, bad data due to communication error and data that has been compromised due to attack.
 25. The device of claim 23, wherein the semidefinite programming model includes a configurable parameter to control a required level of trustworthiness for the measurements.
 26. The device of claim 16, wherein the device receives the measurements from a plurality of energy management systems associated with respective power grids.
 27. The device of claim 16, wherein the device comprises a first energy management system of a plurality of energy management systems associated with respective power sub-grids that are interconnected to form the power grid, each of the power sub-grids including a subset of the electrical busses and representing a control area for the corresponding energy management system, and wherein the first energy management system of the device locally computes the state estimates of the electrical characteristic for the subset of the electrical power buses within a first one of the control areas associated with the device.
 28. The device of claim 27, wherein the program code is configured to expand the subset of the electrical power buses for the first one of the control areas to include one of the electrical busses of a second one of the control areas when a transmission line connects the first one of the control areas to the electrical bus of the second control area and the first one of the control areas includes a sensor that provides a measurement associated with the transmission line, and wherein the device computes the solution for the state estimates for the first one of the control areas in accordance with intermediate results for the state estimates received from the second one of the control areas for the electrical bus to which the transmission line is connected.
 29. A method for controlling a power grid with an energy management system by estimating a state for each of a plurality of alternating current (AC) electrical power buses within the power grid, the method comprising: receiving, with an energy management system for the power grid, measurements of an electrical characteristic from a plurality of measurement units (MUs) positioned at a subset of the power buses within the power grid; processing the measurements of the electrical characteristic with the energy management system in accordance with a semidefinite programming model for the power buses within the power grid to compute an estimate of the electrical characteristic currently occurring at each of the power buses within the power grid; and controlling operation of one or more of the power buses based on the computed estimates for the state of each of the power busses. 